Abstract


Any proof of P ≠ NP will have to overcome two barriers: relativization and natural proofs.
Yet over the last decade, we have seen circuit lower bounds (for example, that PP does not 
have linear-size circuits) that overcome both barriers simultaneously. So the question arises of 
whether there is a third barrier to progress on the central questions in complexity theory. 

In this paper we present such a barrier, which we call algebraic relativization or algebriza- 
tion. The idea is that, when we relativize some complexity class inclusion, we should give the 
simulating machine access not only to an oracle A, but also to a low-degree extension of A over 
a finite field or ring. 

We systematically go through basic results and open problems in complexity theory to delineate
the power of the new algebrization barrier. First, we show that all known non-relativizing
results based on arithmetization (both inclusions such as IP = PSPACE and MIP = NEXP, and
separations such as MA_EXP ⊄ P/poly) do indeed algebrize. Second, we show that almost all of
the major open problems (including P versus NP, P versus RP, and NEXP versus P/poly) will
require non-algebrizing techniques. In some cases algebrization seems to explain exactly why
progress stopped where it did: for example, why we have superlinear circuit lower bounds for
PromiseMA but not for NP.

Our second set of results follows from lower bounds in a new model of algebraic query com- 
plexity, which we introduce in this paper and which is interesting in its own right. Some of 
our lower bounds use direct combinatorial and algebraic arguments, while others stem from a 
surprising connection between our model and communication complexity. Using this connection,
we are also able to give an MA-protocol for the Inner Product function with O(√n log n)
communication (essentially matching a lower bound of Klauck), as well as a communication
complexity conjecture whose truth would imply NL ≠ NP.


1 Introduction 


In the history of the P versus NP problem, there were two occasions when researchers stepped 
back, identified some property of almost all the techniques that had been tried up to that point, 
and then proved that no technique with that property could possibly work. These “meta-discoveries” 
constitute an important part of what we understand about the P versus NP problem beyond what 
was understood in 1971. 

The first meta-discovery was relativization. In 1975, Baker, Gill, and Solovay [5] showed 
that techniques borrowed from logic and computability theory, such as diagonalization, cannot 
be powerful enough to resolve P versus NP. For these techniques would work equally well in a 
“relativized world,” where both P and NP machines could compute some function f in a single 
time step. However, there are some relativized worlds where P = NP, and other relativized worlds 
where P ≠ NP. Therefore any solution to the P versus NP problem will require non-relativizing
techniques: techniques that exploit properties of computation that are specific to the real world. 

The second meta-discovery was natural proofs. In 1993, Razborov and Rudich [35] analyzed the 
circuit lower bound techniques that had led to some striking successes in the 1980’s, and showed 
that, if these techniques worked to prove separations like P ≠ NP, then we could turn them around
to obtain faster ways to distinguish random functions from pseudorandom functions. But in that 
case, we would be finding fast algorithms for some of the very same problems (like inverting one-way 
functions) that we wanted to prove were hard. 


1.1 The Need for a New Barrier 


Yet for both of these barriers—relativization and natural proofs—we do know ways to circumvent 
them. 

In the early 1990’s, researchers managed to prove IP = PSPACE [27, 37] and other celebrated 
theorems about interactive protocols, even in the teeth of relativized worlds where these theorems 
were false. To do so, they created a new technique called arithmetization. The idea was that, 
instead of treating a Boolean formula φ as just a black box mapping inputs to outputs, one can
take advantage of the structure of φ, by “promoting” its AND, OR, or NOT gates to arithmetic
operations over some larger field F. One can thereby extend φ to a low-degree polynomial φ̃ :
F^n → F, which has useful error-correcting properties that were unavailable in the Boolean case.
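
As a minimal sketch of this "promotion" step, the snippet below arithmetizes one small formula over a prime field. The modulus P, the helper names, and the example formula are assumptions chosen for illustration; they are not taken from the paper.

```python
# Arithmetization sketch over the prime field F_P.
# P = 101 is an arbitrary illustrative choice of modulus.
P = 101

def NOT(a):
    return (1 - a) % P

def AND(a, b):
    return (a * b) % P

def OR(a, b):
    return (a + b - a * b) % P

def phi_bool(x, y, z):
    """The Boolean formula (x AND y) OR (NOT z), as an ordinary truth function."""
    return int((x and y) or (not z))

def phi_tilde(x, y, z):
    """The same formula with each gate promoted to field arithmetic."""
    return OR(AND(x, y), NOT(z))

# On the Boolean cube the polynomial agrees with the formula...
agrees_on_cube = all(
    phi_tilde(x, y, z) == phi_bool(x, y, z)
    for x in (0, 1) for y in (0, 1) for z in (0, 1)
)

# ...but it can also be evaluated at non-Boolean field points.
off_cube_value = phi_tilde(2, 3, 4)
```

The off-cube evaluations are what the Boolean formula alone does not provide.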

In the case of the natural proofs barrier, a way to circumvent it was actually known since the 
work of Hartmanis and Stearns [18] in the 1960’s. Any complexity class separation proved via 
diagonalization (such as P ≠ EXP or Σ_2^EXP ⊄ P/poly [23]) is inherently non-naturalizing. For
diagonalization zeroes in on a specific property of the function f being lower-bounded—namely, 
the ability of f to simulate a whole class of machines—and thereby avoids the trap of arguing that 
“f is hard because it looks like a random function.” 

Until a decade ago, one could at least say that all known circuit lower bounds were subject 
either to the relativization barrier, or to the natural proofs barrier. But not even that is true 
any more. We now have circuit lower bounds that evade both barriers, by cleverly combining 
arithmetization (which is non-relativizing) with diagonalization (which is non-naturalizing). 

The first such lower bound was due to Buhrman, Fortnow, and Thierauf [8], who showed that
MA_EXP, the exponential-time analogue of MA, is not in P/poly. To prove that their result was
non-relativizing, Buhrman et al. also gave an oracle A such that MA_EXP^A ⊂ P^A/poly. Using similar
ideas, Vinodchandran [41] showed that for every fixed k, the class PP does not have circuits of
size n^k; and Aaronson [1] showed that Vinodchandran’s result was non-relativizing, by giving an
oracle A such that PP^A ⊂ SIZE^A(n). Recently, Santhanam [36] gave a striking improvement of
Vinodchandran’s result, by showing that for every fixed k, the class PromiseMA does not have
circuits of size n^k.

As Santhanam [36] stressed, these results raise an important question: given that current tech- 
niques can already overcome the two central barriers of complexity theory, how much further can 
one push those techniques? Could arithmetization and diagonalization already suffice to prove 
circuit lower bounds for NEXP, or even P ≠ NP? Or is there a third barrier, beyond relativization
and natural proofs, to which even the most recent results are subject? 
































1.2 Our Contribution 


In this paper we show that there is, alas, a third barrier to solving P versus NP and the other 
central problems of complexity theory. 


Recall that a key insight behind the non-relativizing interactive proof results was that, given
a Boolean formula φ, one need not treat φ as merely a black box, but can instead reinterpret it
as a low-degree polynomial φ̃ over a larger field or ring. To model that insight, in this paper
we consider algebraic oracles: oracles that can evaluate not only a Boolean function f, but also a
low-degree extension f̃ of f over a finite field or the integers. We then define algebrization (short
for “algebraic relativization”), the main notion of this paper.

Roughly speaking, we say that a complexity class inclusion C ⊆ D algebrizes if C^A ⊆ D^Ã for all
oracles A and all low-degree extensions Ã of A. Likewise, a separation C ⊄ D algebrizes if C^Ã ⊄ D^A
for all A, Ã. Notice that algebrization is defined differently for inclusions and separations; and
that in both cases, only one complexity class gets the algebraic oracle Ã, while the other gets the
Boolean version A. These subtle asymmetries are essential for this new notion to capture what we
want, and will be explained in Section 2.

We will demonstrate how algebrization captures a new barrier by proving two sets of results. 
The first set shows that, of the known results based on arithmetization that fail to relativize, all of 
them algebrize. This includes the interactive proof results, as well as their consequences for circuit 
lower bounds. More concretely, in Section 3 we show (among other things) that, for all oracles A
and low-degree extensions Ã of A:


• PSPACE^A ⊆ IP^Ã

• NEXP^A ⊆ MIP^Ã

• MA_EXP^Ã ⊄ P^A/poly

• PromiseMA^Ã ⊄ SIZE^A(n^k)


The second set of results shows that, for many basic complexity questions, any solution will 
require non-algebrizing techniques. In Section 5 we show (among other things) that there exist
oracles A, Ã relative to which:


• NP^Ã ⊆ P^A, and indeed PSPACE^Ã ⊆ P^A

• NP^A ⊄ P^Ã, and indeed RP^A ⊄ P^Ã

• NP^A ⊄ BPP^Ã, and indeed NP^A ⊄ BQP^Ã and NP^A ⊄ coMA^Ã

• P^{NP^A} ⊄ PP^Ã

• NEXP^Ã ⊂ P^A/poly

• NP^Ã ⊂ SIZE^A(n)


These results imply, in particular, that any resolution of the P versus NP problem will need to 
use non-algebrizing techniques. But the take-home message for complexity theorists is stronger: 
non-algebrizing techniques will be needed even to derandomize RP, to separate NEXP from P/poly, 
or to prove superlinear circuit lower bounds for NP. 

By contrast, recall that the separations MA_EXP ⊄ P/poly and PromiseMA ⊄ SIZE(n^k) have
already been proved with algebrizing techniques. Thus, we see that known techniques can prove
superlinear circuit lower bounds for PromiseMA, but cannot do the same for NP, even though
MA = NP under standard hardness assumptions [26]. Similarly, known techniques can prove
superpolynomial circuit lower bounds for MA_EXP but not for NEXP. To summarize:


Algebrization provides nearly the precise limit on the non-relativizing techniques of the 
last two decades. 


We speculate that going beyond this limit will require fundamentally new methods.¹


1.3 Techniques 


This section naturally divides into two, one for each of our main sets of results. 


1.3.1 Proving That Existing Results Algebrize 


Showing that the interactive proof results algebrize is conceptually simple (though a bit tedious in 
some cases), once one understands the specific way these results use arithmetization. In our view, 
it is the very naturalness of the algebrization concept that makes the proofs so simple. 

To illustrate, consider the result of Lund, Fortnow, Karloff, and Nisan [27] that coNP ⊆ IP. In
the LFKN protocol, the verifier (Arthur) starts with a Boolean formula φ, which he arithmetizes to
produce a low-degree polynomial φ̃ : F^n → F. The prover (Merlin) then wants to convince Arthur
that

Σ_{x ∈ {0,1}^n} φ̃(x) = 0.

To do so, Merlin engages Arthur in a conversation about the sums of φ̃ over various subsets of
points in F^n. For almost all of this conversation, Merlin is “doing the real work.” Indeed, the only
time Arthur ever uses his description of φ̃ is in the very last step, when he checks that φ̃(r_1, ..., r_n)
is equal to the value claimed by Merlin, for some random field elements r_1, ..., r_n chosen earlier in
the protocol.
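
The structure of such a conversation can be sketched with a generic sum-check loop. This is a hedged illustration rather than the paper's own presentation: the modulus P, the function names, the honest brute-force prover, and the example polynomial g are all assumptions. The point to notice is that the verifier evaluates g itself only once, in the final line.

```python
import random
from itertools import product

P = 2**31 - 1  # a prime modulus; an arbitrary choice for this sketch

def interpolate(ys, r):
    """Evaluate at r the unique polynomial of degree < len(ys) taking values ys at 0, 1, 2, ..."""
    total = 0
    for i, yi in enumerate(ys):
        num, den = 1, 1
        for j in range(len(ys)):
            if j != i:
                num = num * (r - j) % P
                den = den * (i - j) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def honest_round(g, n, prefix, d):
    """Merlin's message: values at t = 0..d of the sum of g over the remaining Boolean variables."""
    rest = n - len(prefix) - 1
    return [
        sum(g(*prefix, t, *tail) for tail in product((0, 1), repeat=rest)) % P
        for t in range(d + 1)
    ]

def sumcheck(g, n, d, claimed_sum, prover=honest_round):
    """Arthur's side; d >= 1 bounds the degree of g in each variable."""
    claim, prefix = claimed_sum % P, []
    for _ in range(n):
        ys = prover(g, n, prefix, d)
        if (ys[0] + ys[1]) % P != claim:   # consistency with the running claim
            return False
        r = random.randrange(P)            # Arthur's random field element
        claim = interpolate(ys, r)
        prefix.append(r)
    return g(*prefix) % P == claim         # the single evaluation of g by Arthur

def g(x, y, z):
    """Arithmetization of (x AND y) OR (NOT z); it sums to 5 over the Boolean cube."""
    return x * y + (1 - z) - x * y * (1 - z)

accepted = sumcheck(g, 3, 2, 5)
rejected = not sumcheck(g, 3, 2, 4)
```

In the coNP setting the claimed sum would be 0; the example uses a nonzero sum only so that both the accepting and rejecting paths are exercised.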














Now suppose we want to prove coNP^A ⊆ IP^Ã. The only change is that now Arthur’s formula φ
will in general contain A gates, in addition to the usual AND, OR, and NOT gates. And therefore,
when Arthur arithmetizes φ to produce a low-degree polynomial φ̃, his description of φ̃ will contain
terms of the form A(z_1, ..., z_k). Arthur then faces the problem of how to evaluate these terms
when the inputs z_1, ..., z_k are non-Boolean. At this point, though, the solution is clear: Arthur
simply calls the oracle Ã to get Ã(z_1, ..., z_k)!

While the details are slightly more complicated, the same idea can be used to show PSPACE^A ⊆
IP^Ã and NEXP^A ⊆ MIP^Ã.

But what about the non-relativizing separation results, like MA_EXP ⊄ P/poly? When we examine
the proofs of these results, we find that each of them combines a single non-relativizing
ingredient (namely, an interactive proof result) with a sequence of relativizing results. Therefore,
having shown that the interactive proof results algebrize, we have already done most of the work
of showing the separations algebrize as well.


1.3.2 Proving The Necessity of Non-Algebrizing Techniques 


It is actually easy to show that any proof of NP ⊄ P will need non-algebrizing techniques. One
simply lets A be a PSPACE-complete language and Ã be a PSPACE-complete extension of A; then
NP^Ã = P^A = PSPACE. What is harder is to show that any proof of RP ⊆ P, NP ⊆ BPP, and
so on will need non-algebrizing techniques. For the latter problems, we are faced with the task
of proving algebraic oracle separations. In other words, we need to show (for example) that there
exist oracles A, Ã such that RP^A ⊄ P^Ã and NP^A ⊄ BPP^Ã.

¹ While we have shown that most non-relativizing results algebrize, we note that we have skipped some famous
examples involving small-depth circuits, time-space tradeoffs for SAT, and the like. We discuss some of these
examples in Section 9.

Just like with standard oracle separations, to prove an algebraic oracle separation one has to 
do two things: 


(1) Prove a concrete lower bound on the query complexity of some function. 


(2) Use the query complexity lower bound to diagonalize against a class of Turing machines. 


Step (2) is almost the same for algebraic and standard oracle separations; it uses the bounds 
from (1) in a diagonalization argument. Step (1), on the other hand, is extremely interesting; it 
requires us to prove lower bounds in a new model of algebraic query complexity. 

In this model, an algorithm is given oracle access to a Boolean function A : {0,1}^n → {0,1}.
It is trying to answer some question about A (for example, “is there an x ∈ {0,1}^n such that
A(x) = 1?”) by querying A on various points. The catch is that the algorithm can query not just
A itself, but also an adversarially-chosen low-degree extension Ã : F^n → F of A over some finite
field F.² In other words, the algorithm is no longer merely searching for a needle in a haystack: it
can also search a low-degree extension of the haystack for “nonlocal clues” of the needle’s presence!

This model is clearly at least as strong as the standard one, since an algorithm can always
restrict itself to Boolean queries only (which are answered identically by A and Ã). Furthermore,
we know from interactive proof results that the new model is sometimes much stronger: sampling 
points outside the Boolean cube does, indeed, sometimes help a great deal in determining properties 
of A. This suggests that, to prove lower bounds in this model, we are going to need new techniques. 

In this paper we develop two techniques for lower-bounding algebraic query complexity, which 
have complementary strengths and weaknesses. 

The first technique is based on direct construction of adversarial polynomials. Suppose an 
algorithm has queried the points y_1, ..., y_t ∈ F^n. Then by a simple linear algebra argument,
it is possible to create a multilinear polynomial p that evaluates to 0 on all the y_i’s, and that
simultaneously has any values we specify on 2^n − t points of the Boolean cube. The trouble is that,
on the remaining t Boolean points, p will not necessarily be Boolean: that is, it will not necessarily 
be an extension of a Boolean function. We solve this problem by multiplying p with a second 
multilinear polynomial, to produce a “multiquadratic” polynomial (a polynomial of degree at most 
2 in each variable) that is Boolean on the Boolean cube and that also has the desired adversarial 
behavior. 

The idea above becomes more complicated for randomized lower bounds, where we need to 
argue about the indistinguishability of distributions over low-degree polynomials conditioned on a 
small number of queries. And it becomes more complicated still when we switch from finite field 
extensions to extensions Ã : Z^n → Z over the integers. In the latter case, we can no longer use linear
algebra to construct the multilinear polynomial p, and we need to compensate by bringing in some
tools from elementary number theory, namely Chinese remaindering and Hensel lifting. Even then,
a technical problem (that the number of bits needed to express Ã(x) grows with the running times
of the machines being diagonalized against) currently prevents us from turning query complexity 
lower bounds obtained by this technique into algebraic oracle separations over the integers. 

² Later, we will also consider extensions over the integers.

Our second lower-bound technique comes as an “unexpected present” from communication
complexity. Given a Boolean function A : {0,1}^n → {0,1}, let A_0 and A_1 be the subfunctions
obtained by fixing the first input bit to 0 or 1 respectively. Also, suppose Alice is given the truth
table of A_0, while Bob is given the truth table of A_1. Then we observe the following connection
between algebraic query complexity and communication complexity:


If some property of A can be determined using T queries to a multilinear extension
Ã of A over the finite field F, then it can also be determined by Alice and Bob using
O(Tn log |F|) bits of communication.
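
One way to see why such a simulation is plausible is the identity Ã(y_1, y′) = (1 − y_1) Ã_0(y′) + y_1 Ã_1(y′), which holds for multilinear extensions. The sketch below checks it numerically. The modulus P, the helper names, and the example function A are assumptions made for illustration, and the snippet shows only how one query could be answered, not the paper's full protocol.

```python
from itertools import product

P = 97  # field size; an arbitrary prime chosen for this sketch

def mle(table, y):
    """Multilinear extension of a truth table (dict: bit-tuple -> 0/1), evaluated at y in F_P^k."""
    total = 0
    for x, val in table.items():
        weight = 1
        for xi, yi in zip(x, y):
            weight = weight * (yi if xi else 1 - yi) % P
        total = (total + val * weight) % P
    return total

def split(table):
    """Alice's half A_0 (first bit 0) and Bob's half A_1 (first bit 1)."""
    a0 = {x[1:]: v for x, v in table.items() if x[0] == 0}
    a1 = {x[1:]: v for x, v in table.items() if x[0] == 1}
    return a0, a1

def answer_query(a0, a1, y):
    """Each player evaluates the extension of their own half; one field element each is exchanged."""
    from_alice = mle(a0, y[1:])
    from_bob = mle(a1, y[1:])
    return ((1 - y[0]) * from_alice + y[0] * from_bob) % P

n = 3
A = {x: int(sum(x) == 2) for x in product((0, 1), repeat=n)}  # example function
A0, A1 = split(A)
y = (5, 40, 96)
matches = answer_query(A0, A1, y) == mle(A, y)
```

Under these assumptions, each query costs the players the query point itself (n field elements) plus two field elements of answers, which is consistent with the O(Tn log |F|) bound stated above.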














This connection is extremely generic: it lets us convert randomized algorithms querying Ã into
randomized communication protocols, quantum algorithms into quantum protocols, MA-algorithms
into MA-protocols, and so on. Turning the connection around, we find that any communication
complexity lower bound automatically leads to an algebraic query complexity lower bound. This
means, for example, that we can use celebrated lower bounds for the Disjointness problem [33, 22,
25, 34] to show that there exist oracles A, Ã relative to which NP^A ⊄ BPP^Ã, and even NP^A ⊄ BQP^Ã
and NP^A ⊄ coMA^Ã. For the latter two results, we do not know of any proof by direct construction
of polynomials.

The communication complexity technique has two further advantages: it yields multilinear 
extensions instead of multiquadratic ones, and it works just as easily over the integers as over finite 
fields. On the other hand, the lower bounds one gets from communication complexity are more 
contrived. For example, one can show that solving the Disjointness problem requires exponentially
many queries to Ã, but not that finding a Boolean x such that A(x) = 1 does. Also, we do not
know how to use communication complexity to construct A, Ã such that NEXP^Ã ⊂ P^A/poly and
NP^Ã ⊂ SIZE^A(n).


1.4 Related Work 


In a survey article on “The Role of Relativization in Complexity Theory,” Fortnow [13] defined a 
class of oracles O relative to which IP = PSPACE. His proof that IP^A = PSPACE^A for all A ∈ O
was similar to our proof, in Section 3.2, that IP = PSPACE algebrizes. However, because he wanted
both complexity classes to have access to the same oracle A, Fortnow had to define his oracles in
a subtle recursive way, as follows: start with an arbitrary Boolean oracle B, then let B̃ be the
multilinear extension of B, then let f be the “Booleanization” of B̃ (i.e., f(x, i) is the i-th bit in
the binary representation of B̃(x)), then let B̃̃ be the multilinear extension of f, and so on ad
infinitum. Finally let A be the concatenation of all these oracles.

As we discuss in Section 10.1, it seems extremely difficult to prove separations relative to these 
recursively-defined oracles. So if the goal is to show the limitations of current proof techniques 
for solving open problems in complexity theory, then a non-recursive definition like ours seems 
essential. 

Recently (and independently of us), Juma, Kabanets, Rackoff and Shpilka [21] studied an 
algebraic query complexity model closely related to ours, and proved lower bounds in this model. 
In our terminology, they “almost” constructed an oracle A, and a multiquadratic extension Ã of
A, such that #P^A ⊄ FP^Ã/poly.³ Our results in Section 4 extend those of Juma et al. and solve
some of their open problems.

³ We say “almost” because they did not ensure Ã(x) was Boolean for all Boolean x; this is an open problem of
theirs that we solve in Section 4.2.1. Also, their result is only for field extensions and not integer extensions.


Juma et al. also made the interesting observation that, if the extension Ã is multilinear rather
than multiquadratic, then oracle access to Ã sometimes switches from being useless to being
extraordinarily powerful. For example, let A : {0,1}^n → {0,1} be a Boolean function, and let
Ã : F^n → F be the multilinear extension of A, over any field F of characteristic other than 2. Then
we can evaluate the sum Σ_{x ∈ {0,1}^n} A(x) with just a single query to Ã, by using the fact that

Σ_{x ∈ {0,1}^n} A(x) = 2^n Ã(1/2, ..., 1/2).

This observation helps to explain why, in Section 4, we will often need to resort to multiquadratic
extensions instead of multilinear ones.
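
The single-query trick rests on a standard property of multilinear extensions: at the point (1/2, ..., 1/2) every Boolean input receives weight 2^(−n), so Σ_x A(x) = 2^n Ã(1/2, ..., 1/2). A small numerical check follows; the prime P, the helper name mle, and the example function are assumptions made for this sketch.

```python
from itertools import product

P = 101  # an odd prime, so the field has characteristic other than 2

def mle(table, y):
    """Multilinear extension of a truth table (dict: bit-tuple -> 0/1), evaluated at y in F_P^n."""
    total = 0
    for x, val in table.items():
        weight = 1
        for xi, yi in zip(x, y):
            weight = weight * (yi if xi else 1 - yi) % P
        total = (total + val * weight) % P
    return total

n = 4
A = {x: int(sum(x) % 3 == 1) for x in product((0, 1), repeat=n)}  # example function

half = pow(2, -1, P)                      # the field element 1/2
single_query = mle(A, (half,) * n)        # one query to the extension
recovered_sum = pow(2, n, P) * single_query % P
true_sum = sum(A.values()) % P
```

The recovered value is the sum reduced modulo P, so the field must be large enough (here P > 2^n) for it to equal the integer count.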


1.5 Table of Contents 


The rest of the paper is organized as follows. 


Section 2 Formal definition of algebraic oracles, and various subtleties of the model 
Section 3 Why known results such as IP = PSPACE and MIP = NEXP algebrize 
Section 4 Lower bounds on algebraic query complexity 

Section 5 Why open problems will require non-algebrizing techniques to be solved 
Section 6 Generalizing to low-degree extensions over the integers 

Section 7 Two applications of algebrization to communication complexity 

Section 8 The GMW zero-knowledge protocol for NP, and why it algebrizes 

Section 9 Whether we have non-relativizing techniques besides arithmetization 
Section 10 Two ideas for going beyond the algebrization barrier and their limitations 
Section 11 Conclusions and open problems 


Also, the following table lists our most important results and where to find them. 


Result                                                                      Theorem(s)
IP = PSPACE algebrizes                                                      3.7
MIP = NEXP algebrizes                                                       3.8
Recent circuit lower bounds like MA_EXP ⊄ P/poly algebrize                  3.16-3.18
Lower bound on algebraic query complexity (deterministic, over fields)      4.4
Lower bound on algebraic query complexity (probabilistic, over fields)      4.9
Communication lower bounds imply algebraic query lower bounds               4.11
Proving P ≠ NP will require non-algebrizing techniques                      5.1
Proving P = NP (or P = RP) will require non-algebrizing techniques          5.3
Proving NP ⊆ BPP (or NP ⊂ P/poly) will require non-algebrizing techniques   5.4
Proving NEXP ⊄ P/poly will require non-algebrizing techniques               5.6
Proving NP ⊆ BQP, BPP = BQP, etc. will require non-algebrizing techniques   5.11
Lower bound on algebraic query complexity (deterministic, over integers)    6.10
Plausible communication complexity conjecture implying NL ≠ NP              7.2
Inner Product admits an MA-protocol with O(√n log n) communication          7.4
The GMW Theorem algebrizes                                                  8.4


2 Oracles and Algebrization 


In this section we discuss some preliminaries, and then formally define the main notions of the 
paper: extension oracles and algebrization. 

We use [t] to denote the set {1, ..., t}. See the Complexity Zoo⁴ for definitions of the complexity
classes we use.

Given a multivariate polynomial p(x_1, ..., x_n), we define the multidegree of p, or mdeg(p), to
be the maximum degree of any x_i. We say p is multilinear if mdeg(p) ≤ 1, and multiquadratic
if mdeg(p) ≤ 2. Also, we call p an extension polynomial if p(x) ∈ {0,1} whenever x ∈ {0,1}^n.
Intuitively, this means that p is the polynomial extension of some Boolean function f : {0,1}^n →
{0,1}.
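
These definitions can be made concrete with a small sketch. The sparse dictionary representation (exponent tuple mapped to coefficient), the function names, and the use of integer coefficients are assumptions made for illustration only.

```python
from itertools import product

def mdeg(poly):
    """Largest exponent of any single variable over monomials with nonzero coefficient."""
    return max((e for exps, c in poly.items() if c != 0 for e in exps), default=0)

def evaluate(poly, x):
    """Evaluate a polynomial given as {exponent tuple: coefficient} at the point x."""
    total = 0
    for exps, c in poly.items():
        term = c
        for xi, e in zip(x, exps):
            term *= xi ** e
        total += term
    return total

def is_extension_polynomial(poly, n):
    """True if the polynomial takes only the values 0 and 1 on the Boolean cube {0,1}^n."""
    return all(evaluate(poly, x) in (0, 1) for x in product((0, 1), repeat=n))

or_poly = {(1, 0): 1, (0, 1): 1, (1, 1): -1}   # x1 + x2 - x1*x2, extends OR
quad_poly = {(2, 1): 1}                        # x1^2 * x2, extends AND
sum_poly = {(1, 0): 1, (0, 1): 1}              # x1 + x2, equals 2 at (1, 1)
```

Here or_poly is multilinear, quad_poly is multiquadratic but not multilinear, and sum_poly is multilinear yet fails to be an extension polynomial.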

The right way to relativize complexity classes such as PSPACE and EXP has long been a subject 
of dispute: should we allow exponentially-long queries to the oracle, or only polynomially-long 
queries? On the one hand, if we allow exponentially-long queries, then statements like “IP = 
PSPACE is non-relativizing” are reduced to trivialities, since the PSPACE machine can simply 
query oracle bits that the IP machine cannot reach. Furthermore the result of Chandra, Kozen, 
and Stockmeyer [11] that APSPACE = EXP becomes non-relativizing, which seems perverse. On 
the other hand, if we allow only polynomially-long queries, then results based on padding (for
example, P = NP ⟹ EXP = NEXP) will generally fail to relativize.⁵

In this paper we adopt a pragmatic approach, writing C^A or C^{A[poly]} to identify which convention
we have in mind. More formally:








Definition 2.1 (Oracle) An oracle A is a collection of Boolean functions A_m : {0,1}^m → {0,1},
one for each m ∈ N. Then given a complexity class C, by C^A we mean the class of languages
decidable by a C machine that can query A_m for any m of its choice. By C^{A[poly]} we mean the
class of languages decidable by a C machine that, on inputs of length n, can query A_m for any
m = O(poly(n)). For classes C such that all computation paths are polynomially bounded (for
example, P, NP, BPP, #P...), it is obvious that C^{A[poly]} = C^A.


We now define the key notion of an extension oracle. 


Definition 2.2 (Extension Oracle Over A Finite Field) Let A_m : {0,1}^m → {0,1} be a Boolean
function, and let F be a finite field. Then an extension of A_m over F is a polynomial Ã_{m,F} : F^m → F
such that Ã_{m,F}(x) = A_m(x) whenever x ∈ {0,1}^m. Also, given an oracle A = (A_m), an extension
Ã of A is a collection of polynomials Ã_{m,F} : F^m → F, one for each positive integer m and finite
field F, such that

(i) Ã_{m,F} is an extension of A_m for all m, F, and

(ii) there exists a constant c such that mdeg(Ã_{m,F}) ≤ c for all m, F.⁶

Then given a complexity class C, by C^Ã we mean the class of languages decidable by a C machine
that can query Ã_{m,F} for any integer m and finite field F. By C^{Ã[poly]} we mean the class of
languages decidable by a C machine that, on inputs of length n, can query Ã_{m,F} for any integer
m = O(poly(n)) and finite field with |F| = 2^{O(m)}.

⁵ Indeed, let A be any PSPACE-complete language. Then P^A = NP^A, but EXP^{A[poly]} = NEXP^{A[poly]} if and only if
EXP = NEXP in the unrelativized world.

⁶ All of our results would work equally well if we instead chose to limit mdeg(Ã_{m,F}) by a linear or polynomial
function of m. On the other hand, nowhere in this paper will mdeg(Ã_{m,F}) need to be greater than 2.














We use mdeg(Ã) to denote the maximum multidegree of any Ã_{m,F}.

For most of this paper, we will restrict ourselves to extensions over finite fields, as they are easier 
to work with than integer extensions and let us draw almost the same conceptual conclusions. We 
note that many of our results—including all results showing that existing results algebrize, and all 
oracle separations proved via communication complexity—easily carry over to the integer setting. 
Furthermore, even our oracle separations proved via direct construction can be “partly” carried 
over to the integer setting. Section 6 studies integer extensions in more detail. 


Definition 2.3 (Algebrization) We say the complexity class inclusion C ⊆ D algebrizes if C^A ⊆
D^Ã for all oracles A and all finite field extensions Ã of A. Likewise, we say that C ⊆ D does not
algebrize, or that proving C ⊆ D would require non-algebrizing techniques, if there exist A, Ã such
that C^Ã ⊄ D^A.

We say the separation C ⊄ D algebrizes if C^Ã ⊄ D^A for all A, Ã. Likewise, we say that C ⊄ D
does not algebrize, or that proving C ⊄ D would require non-algebrizing techniques, if there exist
A, Ã such that C^A ⊆ D^Ã.


When we examine the above definition, two questions arise. First, why can one complexity
class access the extension Ã, while the other class can only access the Boolean part A? And second,
why is it the “right-hand class” that can access Ã for inclusions, but the “left-hand class” that can
access Ã for separations?

The basic answer is that, under a more stringent notion of algebrization, we would not know 
how to prove that existing interactive proof results algebrize. So for example, while we will prove
that PSPACE^{A[poly]} ⊆ IP^Ã for all oracles A and extensions Ã of A, we do not know how to prove
that PSPACE^{Ã[poly]} = IP^Ã for all Ã.

Note that for our separation results, this issue seems to make no difference. For example, in 
Section 5 we will construct oracles A, B and extensions Ã, B̃, such that not only P^A = NP^A and
P^B ≠ NP^B, but also NP^Ã ⊆ P^A and NP^B ⊄ P^B̃. This implies that, even under our “broader”
notion of algebrization, any resolution of the P versus NP problem will require non-algebrizing 
techniques. 


3 Why Existing Techniques Algebrize 


In this section, we go through a large number of non-relativizing results in complexity theory, 
and explain why they algebrize. The first batch consists of conditional collapses such as P^{#P} ⊂
P/poly ⟹ P^{#P} = MA, as well as containments such as PSPACE ⊆ IP and NEXP ⊆ MIP. The
second batch consists of circuit lower bounds, such as MA_EXP ⊄ P/poly.

Note that each of the circuit lower bounds actually has a conditional collapse as its only non- 
relativizing ingredient. Therefore, once we show that the conditional collapses algebrize, we have 
already done most of the work of showing that the circuit lower bounds algebrize as well. 

The section is organized as follows. First, in Section 3.1, we show that the self-correctibility of 
#P, proved by Lund et al. [27], is an algebrizing fact. From this it will follow, for example, that
for all oracles A and finite field extensions Ã,


PP^Ã ⊂ P^Ã/poly ⟹ P^{#P^A} ⊆ MA^Ã.


Next, in Section 3.2, we reuse results from Section 3.1 to show that the interactive proof results of
Lund et al. [27] and Shamir [37] algebrize: that is, for all A, Ã, we have P^{#P^A} ⊆ IP^Ã, and indeed
PSPACE^{A[poly]} ⊆ IP^Ã.

Then, in Section 3.3, we sketch an extension to the Babai-Fortnow-Lund theorem [4], giving
us NEXP^{A[poly]} ⊆ MIP^Ã for all A, Ã. The same ideas also yield EXP^{A[poly]} ⊆ MIP_EXP^Ã for all A, Ã,
where MIP_EXP is the subclass of MIP with the provers restricted to lie in EXP. This will imply, in
particular, that

EXP^{Ã[poly]} ⊂ P^Ã/poly ⟹ EXP^{A[poly]} ⊆ MA^Ã

for all A, Ã.

Section 3.4 harvests the consequences for circuit lower bounds. We show there that the results 
of Vinodchandran [41], Buhrman-Fortnow-Thierauf [8], and Santhanam [36] algebrize: that is, for 
all A, Ã,


• PP^Ã ⊄ SIZE^A(n^k) for all constants k
• MA_EXP^Ã ⊄ P^A/poly
• PromiseMA^Ã ⊄ SIZE^A(n^k) for all constants k


Finally, Section 3.5 discusses some miscellaneous interactive proof results, including that of 
Impagliazzo, Kabanets, and Wigderson [20] that NEXP ⊂ P/poly ⟹ NEXP = MA, and that of
Feige and Kilian [12] that RG = EXP. 

Throughout the section, we assume some familiarity with the proofs of the results we are 
algebrizing. 


3.1 Self-Correction for #P: Algebrizing 


In this subsection we examine some non-relativizing properties of the classes #P and PP, and show
that these properties algebrize. Our goal will be to prove tight results, since that is what we will
need later to show that Santhanam’s lower bound PromiseMA ⊄ SIZE(n^k) [36] is algebrizing. The
need for tightness will force us to do a little more work than would otherwise be necessary. 

The first step is to define a convenient #P-complete problem. 





Definition 3.1 (#FSAT) An FSAT formula over the finite field F, in the variables x_1, ..., x_N,
is a circuit with unbounded fan-in and fan-out 1, where every gate is labeled with either + or ×, and
every leaf is labeled with either an x_i or a constant c ∈ F. Such a formula represents a polynomial
p : F^N → F in the obvious way. The size of the formula is the number of gates.

Now let #FSAT_{L,F} be the following problem: given a polynomial p : F^N → F specified by an
FSAT formula of size at most L, evaluate the sum

S(p) := Σ_{x_1, ..., x_N ∈ {0,1}} p(x_1, ..., x_N).

Also, let #FSAT be the same problem but where the input has the form (L, F, p) (i.e., L and F are
given as part of the input). For the purpose of measuring time complexity, the size of an #FSAT
instance is defined to be n := L log |F|.
Observe that if $p$ is represented by an FSAT formula of size $L$, then $\deg(p) \le L$.
It is clear that #FSAT is $\#\mathsf{P}$-complete. Furthermore, Lund, Fortnow, Karloff, and Nisan [27]
showed that #FSAT is self-correctible, in the following sense:


Theorem 3.2 ([27]) There exists a polynomial-time randomized algorithm that, given any $\#\mathrm{FSAT}_{L,\mathbb{F}}$
instance $p$ with $\mathrm{char}(\mathbb{F}) \ge 3L^2$ and any circuit $C$:


(i) Outputs $S(p)$ with certainty if $C$ computes $\#\mathrm{FSAT}_{L,\mathbb{F}}$.


(ii) Outputs either $S(p)$ or “FAIL” with probability at least 2/3, regardless of $C$.











Now let $A$ be a Boolean oracle, and let $\tilde{A}$ be a low-degree extension of $A$ over $\mathbb{F}$. Then an
$\mathrm{FSAT}^{\tilde{A}}$ formula is the same as an FSAT formula, except that in addition to $+$ and $\times$ gates we
also allow $\tilde{A}$-gates: that is, gates with an arbitrary fan-in $h$, which take $b_1,\ldots,b_h \in \mathbb{F}$ as input and
produce $\tilde{A}_h(b_1,\ldots,b_h)$ as output. Observe that if $p$ is represented by an $\mathrm{FSAT}^{\tilde{A}}$ formula of size
$L$, then $\deg(p) \le L^2 \,\mathrm{mdeg}(\tilde{A})$.

Let $\#\mathrm{FSAT}^{\tilde{A}}$ be the same problem as #FSAT, except that the polynomial $p$ is given by an
$\mathrm{FSAT}^{\tilde{A}}$ formula. Then clearly $\#\mathrm{FSAT}^{\tilde{A}} \in \#\mathsf{P}^{\tilde{A}}$. Also:

Proposition 3.3 $\#\mathrm{FSAT}^{\tilde{A}}$ is $\#\mathsf{P}^{A}$-hard under randomized reductions.


Proof. Let $C^A$ be a Boolean circuit over the variables $z_1,\ldots,z_N$, with oracle access to $A$. Then
a canonical $\#\mathsf{P}^{A}$-hard problem is to compute

\[ \sum_{z_1,\ldots,z_N \in \{0,1\}} C^A(z_1,\ldots,z_N), \]

the number of satisfying assignments of $C^A$.

We will reduce this problem to $\#\mathrm{FSAT}^{\tilde{A}}$. For each gate $g$ of $C$, define a variable $x_g$, which
encodes whether $g$ outputs 1. Then the polynomial $p$ will simply be a product of terms that enforce
“correct propagation” through the circuit. For example, if $g$ computes the AND of gates $i$ and $j$,
then we encode the constraint $x_g = x_i \wedge x_j$ by the term

\[ x_g x_i x_j + (1 - x_g)(1 - x_i x_j). \]

Likewise, if $g$ is an oracle gate, then we encode the constraint $x_g = \tilde{A}_h(x_{i_1},\ldots,x_{i_h})$ by the term

\[ x_g \tilde{A}_h(x_{i_1},\ldots,x_{i_h}) + (1 - x_g)\left(1 - \tilde{A}_h(x_{i_1},\ldots,x_{i_h})\right). \]

The last step is to find a sufficiently large prime $q > 2^N$, one that will not affect the sum, to take
as the order of $\mathbb{F}$. This can be done in randomized polynomial time. ∎
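
The two propagation terms used in this proof can be sanity-checked with a few lines of Python. This is only an illustrative sketch with hypothetical function names: it verifies that, on Boolean inputs, each term equals 1 exactly when the gate variable is consistent with its inputs and 0 otherwise, which is what makes the product of all terms an indicator of a correct computation.

```python
from itertools import product

def and_term(x_g, x_i, x_j):
    """Term enforcing x_g = x_i AND x_j on Boolean inputs."""
    return x_g * x_i * x_j + (1 - x_g) * (1 - x_i * x_j)

def oracle_term(x_g, oracle_value):
    """Term enforcing x_g = oracle_value, where oracle_value stands in for
    the extension evaluated at the gate's inputs."""
    return x_g * oracle_value + (1 - x_g) * (1 - oracle_value)

def and_term_is_indicator():
    """Check that the AND term is 1 exactly on consistent Boolean assignments."""
    return all(
        and_term(g, i, j) == (1 if g == (i & j) else 0)
        for g, i, j in product((0, 1), repeat=3)
    )
```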

By contrast, we do not know how to show that $\#\mathrm{FSAT}^{\tilde{A}}$ is $\#\mathsf{P}^{\tilde{A}}$-hard, intuitively because
of a $\#\mathsf{P}^{\tilde{A}}$ machine’s ability to query the $\tilde{A}_h$’s in ways that do not respect their structure as
polynomials.

We now prove an algebrizing version of Theorem 3.2.


Theorem 3.4 There exists a $\mathsf{BPP}^{\tilde{A}}$ algorithm that, given any $\#\mathrm{FSAT}^{\tilde{A}}_{L,\mathbb{F}}$ instance $p$ with
$\mathrm{char}(\mathbb{F}) \ge 3L^3 \,\mathrm{mdeg}(\tilde{A})$ and any circuit $C^{\tilde{A}}$:

(i) Outputs $S(p)$ if $C^{\tilde{A}}$ computes $\#\mathrm{FSAT}^{\tilde{A}}_{L,\mathbb{F}}$.
(ii) Outputs either $S(p)$ or “FAIL” with probability at least 2/3, regardless of $C^{\tilde{A}}$.


Proof. The proof is basically identical to the usual proof of Lund et al. [27] that #FSAT is
self-correctible: that is, we use the circuit $C$ to simulate the prover in an interactive protocol, whose
goal is to convince the verifier of the value of $S(p)$. The only difference is that at the final step,
we get an $\mathrm{FSAT}^{\tilde{A}}$ formula instead of an FSAT formula, so we evaluate that formula with the help
of the oracle $\tilde{A}$.

In more detail, we first call $C^{\tilde{A}}$ to obtain $S'$, the claimed value of the sum $S(p)$. We then
define

\[ p_1(x) := \sum_{x_2,\ldots,x_N \in \{0,1\}} p(x, x_2, \ldots, x_N). \]

Then by making $\deg(p) + 1$ more calls to $C^{\tilde{A}}$, we can obtain $p_1'$, the claimed value of $p_1$. We then
check that $S' = p_1'(0) + p_1'(1)$. If this test fails, we immediately output “FAIL.” Otherwise we
choose $r_1 \in \mathbb{F}$ uniformly at random and set

\[ p_2(x) := \sum_{x_3,\ldots,x_N \in \{0,1\}} p(r_1, x, x_3, \ldots, x_N). \]

We then use $\deg(p) + 1$ more calls to $C^{\tilde{A}}$ to obtain $p_2'$, the claimed value of $p_2$, and check that
$p_1'(r_1) = p_2'(0) + p_2'(1)$. If this test fails we output “FAIL”; otherwise we choose $r_2 \in \mathbb{F}$ uniformly
at random and set

\[ p_3(x) := \sum_{x_4,\ldots,x_N \in \{0,1\}} p(r_1, r_2, x, x_4, \ldots, x_N), \]

and continue in this manner until we reach the polynomial

\[ p_N(x) := p(r_1, \ldots, r_{N-1}, x). \]

At this point we can evaluate $p_N(0)$ and $p_N(1)$ directly, by using the $\mathrm{FSAT}^{\tilde{A}}$ formula for $p$ together
with the oracle $\tilde{A}$. We then check that $p_{N-1}'(r_{N-1}) = p_N(0) + p_N(1)$. If this final test fails then
we output “FAIL”; otherwise we output $S(p) = S'$.

Completeness and soundness follow by the same analysis as in Lund et al. [27]. First, if $C^{\tilde{A}}$
computes $\#\mathrm{FSAT}^{\tilde{A}}_{L,\mathbb{F}}$, then the algorithm outputs $S(p) = S'$ with certainty. Second, if $S(p) \ne S'$,
then by the union bound, the probability that the algorithm is tricked into outputting $S(p) = S'$
is at most

\[ \frac{L \deg(p)}{\mathrm{char}(\mathbb{F})} \le \frac{L^3 \,\mathrm{mdeg}(\tilde{A})}{3L^3 \,\mathrm{mdeg}(\tilde{A})} = \frac{1}{3}. \]

∎
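
The sequence of checks in this proof can be sketched in Python as follows. This is a toy under stated assumptions, not the paper's construction: the polynomial $p$ is passed as an ordinary Python callable over the integers modulo a prime $q$ (rather than as an $\mathrm{FSAT}^{\tilde{A}}$ formula), the circuit $C$ is modeled by a hypothetical callable `prover(prefix, x)` returning the claimed value of $p_j(x)$ for the fixed prefix $r_1,\ldots,r_{j-1}$, and each claimed polynomial $p_j'$ is recovered from $\deg(p)+1$ such calls by Lagrange interpolation. All function names are made up for this illustration.

```python
import random
from itertools import product

def lagrange_eval(points, values, x, q):
    """Evaluate at x the unique polynomial of degree < len(points) passing
    through (points[k], values[k]) over F_q, q prime."""
    total = 0
    for k, (xk, yk) in enumerate(zip(points, values)):
        num, den = 1, 1
        for m, xm in enumerate(points):
            if m != k:
                num = num * (x - xm) % q
                den = den * (xk - xm) % q
        total = (total + yk * num * pow(den, -1, q)) % q
    return total

def honest_prover(p, num_vars, q):
    """Brute-force stand-in for a correct circuit C: returns p_j(x) for a prefix."""
    def partial_sum(prefix, x):
        free = num_vars - len(prefix) - 1
        return sum(p(*prefix, x, *rest)
                   for rest in product((0, 1), repeat=free)) % q
    return partial_sum

def self_correct(p, num_vars, degree, q, prover, claimed_sum, rng=random):
    """Return the claimed sum if every check passes, otherwise "FAIL"."""
    xs = list(range(degree + 1))      # degree+1 distinct points; needs q > degree
    prefix = []
    expected = claimed_sum % q        # value the next pair of evaluations must sum to
    for _ in range(num_vars - 1):
        ys = [prover(tuple(prefix), x) % q for x in xs]   # degree+1 calls to C
        at0 = lagrange_eval(xs, ys, 0, q)
        at1 = lagrange_eval(xs, ys, 1, q)
        if (at0 + at1) % q != expected:
            return "FAIL"
        r = rng.randrange(q)          # fresh random field element r_j
        expected = lagrange_eval(xs, ys, r, q)
        prefix.append(r)
    # Final step: evaluate p_N(0) and p_N(1) directly, without the prover.
    if (p(*prefix, 0) + p(*prefix, 1)) % q != expected:
        return "FAIL"
    return claimed_sum % q
```

With an honest prover the function returns the true sum on every run; a wrong claimed sum paired with otherwise honest answers is rejected already at the first check.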

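From the self-correcting property of $\#\mathsf{P}$-complete problems, Lund et al. [27] deduced the
corollary that $\mathsf{PP} \subset \mathsf{P/poly}$ implies $\mathsf{P}^{\#\mathsf{P}} = \mathsf{PP} = \mathsf{MA}$. We now wish to obtain an algebrizing version
of their result. Thus, let $\mathrm{MAJFSAT}^{\tilde{A}}$ be the following decision version of $\#\mathrm{FSAT}^{\tilde{A}}$: given a
$\#\mathrm{FSAT}^{\tilde{A}}$ instance $(L, \mathbb{F}, p)$, together with an integer $k \in [\mathrm{char}(\mathbb{F})]$, decide whether $S(p) \ge k$
interpreted as an integer. Then clearly $\mathrm{MAJFSAT}^{\tilde{A}}$ is in $\mathsf{PP}^{\tilde{A}}$ and hard for $\mathsf{PP}^{A}$. We will also
refer to $\mathrm{MAJFSAT}^{\tilde{A}}_{L,\mathbb{F}}$ in the case where $L$ and $\mathbb{F}$ are fixed.

Theorem 3.5 For all $A, \tilde{A}$ and time-constructible functions $s$,

\[ \mathrm{MAJFSAT}^{\tilde{A}} \in \mathsf{SIZE}^{\tilde{A}}(s(n)) \Longrightarrow \mathrm{MAJFSAT}^{\tilde{A}} \in \mathsf{MATIME}^{\tilde{A}}(s(n)\,\mathrm{poly}(n)). \]

So in particular, if $\mathsf{PP}^{\tilde{A}} \subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$ then $\mathsf{P}^{\#\mathsf{P}^{A}} \subseteq \mathsf{MA}^{\tilde{A}}$.⁷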


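Proof. Given a procedure to solve $\mathrm{MAJFSAT}^{\tilde{A}}_{L,\mathbb{F}}$, it is clear that we can also solve $\#\mathrm{FSAT}^{\tilde{A}}_{L,\mathbb{F}}$, by
calling the procedure $O(\log q)$ times and using binary search. (This is analogous to the standard
fact that $\mathsf{P}^{\mathsf{PP}} = \mathsf{P}^{\#\mathsf{P}}$.) So if $\mathrm{MAJFSAT}^{\tilde{A}} \in \mathsf{SIZE}^{\tilde{A}}(s(n))$, then an MA machine can first guess
a circuit for $\mathrm{MAJFSAT}^{\tilde{A}}_{L,\mathbb{F}}$ of size $s(n)$, and then use that circuit to simulate the prover in an
interactive protocol for $\mathrm{MAJFSAT}^{\tilde{A}}_{L,\mathbb{F}}$, exactly as in Theorem 3.4. This incurs at most polynomial
blowup, and therefore places $\mathrm{MAJFSAT}^{\tilde{A}}$ in $\mathsf{MATIME}^{\tilde{A}}(s(n)\,\mathrm{poly}(n))$.

In particular, if $\mathsf{PP}^{\tilde{A}} \subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$, then $\mathrm{MAJFSAT}^{\tilde{A}}$ is in $\mathsf{P}^{\tilde{A}}/\mathsf{poly}$, hence $\mathrm{MAJFSAT}^{\tilde{A}}$ is in
$\mathsf{MA}^{\tilde{A}}$, hence $\mathsf{PP}^{A} \subseteq \mathsf{MA}^{\tilde{A}}$, hence $\mathsf{P}^{\mathsf{PP}^{A}} = \mathsf{P}^{\#\mathsf{P}^{A}} \subseteq \mathsf{MA}^{\tilde{A}}$. ∎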
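
The binary-search step at the start of this proof is simple enough to sketch directly. In the snippet below, `threshold_oracle(k)` is a hypothetical stand-in for a procedure deciding whether $S(p) \ge k$; the function name and interface are assumptions made for illustration.

```python
def count_via_threshold(threshold_oracle, q):
    """Recover S in {0, ..., q-1} from an oracle for the predicate 'S >= k'.
    Returns (S, number of oracle calls)."""
    lo, hi = 0, q - 1            # invariant: lo <= S <= hi
    calls = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        calls += 1
        if threshold_oracle(mid):
            lo = mid             # S >= mid, so discard the lower part
        else:
            hi = mid - 1         # S < mid, so discard the upper part
    return lo, calls
```

Each call at least halves the candidate interval (rounding up), so the number of calls is at most $\lceil \log_2 q \rceil$.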


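3.2 IP = PSPACE: Algebrizing


Examining the proof of Theorem 3.4, it is not hard to see that the $\mathsf{P}^{\#\mathsf{P}} \subseteq \mathsf{IP}$ theorem of Lund et
al. [27] algebrizes as well.


Theorem 3.6 For all $A, \tilde{A}$, $\mathsf{P}^{\#\mathsf{P}^{A}} \subseteq \mathsf{IP}^{\tilde{A}}$.


Proof. It suffices to note that, in the proof of Theorem 3.4, we actually gave an interactive protocol
for $\#\mathrm{FSAT}^{\tilde{A}}$ where the verifier was in $\mathsf{BPP}^{\tilde{A}}$. Since $\#\mathrm{FSAT}^{\tilde{A}}$ is $\#\mathsf{P}^{A}$-hard by Proposition 3.3,
this implies the containment $\mathsf{P}^{\#\mathsf{P}^{A}} \subseteq \mathsf{IP}^{\tilde{A}}$. ∎

Indeed we can go further, and show that the famous IP = PSPACE theorem of Shamir [37] is
algebrizing.


Theorem 3.7 For all $A, \tilde{A}$, $\mathsf{PSPACE}^{A[\mathrm{poly}]} \subseteq \mathsf{IP}^{\tilde{A}}$.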


Proof Sketch. When we generalize the #P protocol of Lund et al. [27] to the PSPACE protocol of
Shamir [37], the conversation between the prover and verifier becomes somewhat more complicated,
due to the arithmetization of quantifiers. The prover now needs to prevent the degrees of the
relevant polynomials from doubling at each iteration, which requires additional steps of degree
reduction (e.g. “multilinearization” operators). However, the only step of the protocol that is
relevant for algebrization is the last one, when the verifier checks that $p(r_1,\ldots,r_N)$ is equal to the
value claimed by the prover for some $r_1,\ldots,r_N \in \mathbb{F}$. And this step can be algebrized exactly as
in the #P case. ∎


⁷ We could have avoided talking about MAJFSAT at all in this theorem, had we been content to show
that $\mathsf{PP}^{\tilde{A}} \subset \mathsf{SIZE}^{\tilde{A}}(s(n))$ implies $\mathsf{PP}^{A} \subseteq \mathsf{MATIME}^{\tilde{A}}(s(\mathrm{poly}(n)))$. But in that case, when we tried to
show that Santhanam’s result $\mathsf{PromiseMA} \not\subset \mathsf{SIZE}(n^k)$ was algebrizing, we would only obtain the weaker result
$\mathsf{PromiseMATIME}^{\tilde{A}}(n^{\mathrm{polylog}\, n}) \not\subset \mathsf{SIZE}^{A}(n^k)$.


3.3 MIP = NEXP: Algebrizing


Babai, Fortnow, and Lund [4] showed that MIP = NEXP. In this subsection we will sketch a proof 
that this result algebrizes: 


Theorem 3.8 For all $A, \tilde{A}$, $\mathsf{NEXP}^{A[\mathrm{poly}]} \subseteq \mathsf{MIP}^{\tilde{A}}$.


To prove Theorem 3.8, we will divide Babai et al.’s proof into three main steps, and show that
each of them algebrizes.
The first step is to define a convenient NEXP-complete problem.


Definition 3.9 (hSAT) Let an $h$-formula over the variables $x_1,\ldots,x_n \in \{0,1\}$ be a Boolean
formula consisting of AND, OR, and NOT gates, as well as gates of fan-in $n$ that compute a
Boolean function $h : \{0,1\}^n \to \{0,1\}$.

Then given an $h$-formula $C^h$, let hSAT be the problem of deciding whether there exists a Boolean
function $h : \{0,1\}^n \to \{0,1\}$ such that $C^h(x) = 0$ for all $x \in \{0,1\}^n$.


Babai et al. showed the following:

Lemma 3.10 ([4]) hSAT is NEXP-complete.


The proof of this lemma is very simple: $h$ encodes both the nondeterministic guess of the NEXP
machine on the given input, as well as the entire tableau of the computation with that guess. And
the extension to circuits with oracle access is equally simple. Let $A$ be a Boolean oracle, and let
$\mathrm{hSAT}^{A}$ be the variant of hSAT where the formula $C^{h,A}$ can contain gates for both $h$ and $A$. Then
the first observation we make is that Lemma 3.10 relativizes: $\mathrm{hSAT}^{A}$ is $\mathsf{NEXP}^{A[\mathrm{poly}]}$-complete.
Indeed, $h$ is constructed in exactly the same way. We omit the details.

The second step in Babai et al.’s proof is to use the LFKN protocol [27] to verify that $C^h(x) = 0$
for all $x$, assuming that the prover and verifier both have oracle access to a low-degree extension
$\tilde{h} : \mathbb{F}^n \to \mathbb{F}$ of $h$.




















Lemma 3.11 ([4]) Let $\tilde{h} : \mathbb{F}^n \to \mathbb{F}$ be any low-degree extension of a Boolean function $h$. Then it
is possible to verify, in $\mathsf{IP}^{\tilde{h}}$, that $C^h(x) = 0$ for all $x \in \{0,1\}^n$.


Proof Sketch. Observe that if we arithmetize $C^h$, then we get a low-degree polynomial
$\tilde{C}^{\tilde{h}} : \mathbb{F}^n \to \mathbb{F}$ extending $C^h$. Furthermore, $\tilde{C}^{\tilde{h}}$ can be efficiently evaluated given oracle access to $\tilde{h}$. So by
using the LFKN protocol, the verifier can check that

\[ \sum_{x \in \{0,1\}^n} \tilde{C}^{\tilde{h}}(x) = \sum_{x \in \{0,1\}^n} C^h(x) = 0. \]

∎

Our second observation is that Lemma 3.11 algebrizes: if we allow the prover and verifier oracle
access to any low-degree extension $\tilde{A}$ of $A$, then the same protocol works to ensure that $C^{h,A}(x) = 0$
for all $x \in \{0,1\}^n$.

In reality, of course, the verifier is not given oracle access to a low-degree extension $\tilde{h}$. So the
third step in Babai et al.’s proof is a low-degree test and subsequent self-correction algorithm, which
allow the verifier to simulate oracle access to $\tilde{h}$ by exchanging messages with two untrustworthy
provers.

Lemma 3.12 ([4]) There exists a $\mathsf{BPP}^{B}$ algorithm that, given any oracle $B : \mathbb{F}^n \to \mathbb{F}$ and input
$y \in \mathbb{F}^n$:


(i) Outputs $B(y)$ if $B$ is a low-degree polynomial.


(ii) Outputs “FAIL” with probability $\Omega(1/\mathrm{poly}(n))$ if $B$ differs from all low-degree polynomials
on an $\Omega(1/\mathrm{poly}(n))$ fraction of points.


Combining Lemmas 3.11 and 3.12, we see that the verifier in the LFKN protocol does not need
the guarantee that the oracle gates in $C^h$, which are supposed to compute $\tilde{h}$, indeed do so. A
cheating prover will either be caught, or else the execution will be indistinguishable from one with
a real $\tilde{h}$.

Our final observation is that Lemma 3.12 deals only with the gates of $C$ computing $\tilde{h}$, and is
completely independent of what other gates $C$ has. It therefore algebrizes automatically when we
switch to circuits containing oracle gates $\tilde{A}$. This completes the proof sketch of Theorem 3.8.

We conclude this section by pointing out one additional result. In Babai et al.’s original proof,
if the language $L$ to be verified is in EXP, then the function $h$ encodes only the tableau of the
computation. It can therefore be computed by the provers in EXP. Furthermore, if $h$ is in EXP,
then the unique multilinear extension $\tilde{h} : \mathbb{F}^n \to \mathbb{F}$ is also in EXP. So letting $\mathsf{MIP}_{\mathsf{EXP}}$ be the subclass
of MIP where the provers are in EXP, we get the following consequence:




















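Theorem 3.13 ([4]) $\mathsf{MIP}_{\mathsf{EXP}} = \mathsf{EXP}$.


Now, it is clear that if $L \in \mathsf{EXP}^{A[\mathrm{poly}]}$ then $h$ and $\tilde{h}$ can be computed by the provers in $\mathsf{EXP}^{A[\mathrm{poly}]}$.
We therefore find that Theorem 3.13 algebrizes as well:


Theorem 3.14 For all $A, \tilde{A}$, $\mathsf{EXP}^{A[\mathrm{poly}]} \subseteq \mathsf{MIP}^{\tilde{A}}_{\mathsf{EXP}}$.

Theorem 3.14 has the following immediate corollary:

Corollary 3.15 For all $A, \tilde{A}$, if $\mathsf{EXP}^{A[\mathrm{poly}]} \subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$ then $\mathsf{EXP}^{A[\mathrm{poly}]} \subseteq \mathsf{MA}^{\tilde{A}}$.


Proof. If $\mathsf{EXP}^{A[\mathrm{poly}]} \subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$, then an $\mathsf{MA}^{\tilde{A}}$ verifier can guess two polynomial-size circuits, and
use them to simulate the $\mathsf{EXP}^{A[\mathrm{poly}]}$ provers in an $\mathsf{MIP}^{\tilde{A}}_{\mathsf{EXP}}$ protocol for $\mathsf{EXP}^{A[\mathrm{poly}]}$. ∎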


3.4 Recent Circuit Lower Bounds: Algebrizing 


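As mentioned earlier, Vinodchandran [41] showed that $\mathsf{PP} \not\subset \mathsf{SIZE}(n^k)$ for all constants $k$, and
Aaronson [1] showed that this result fails to relativize. However, by using Theorem 3.5, we can
now show that Vinodchandran’s result algebrizes.


Theorem 3.16 For all $A, \tilde{A}$ and constants $k$, we have $\mathsf{PP}^{\tilde{A}} \not\subset \mathsf{SIZE}^{A}(n^k)$.


Proof. If $\mathsf{PP}^{\tilde{A}} \not\subset \mathsf{P}^{A}/\mathsf{poly}$ then we are done, so assume $\mathsf{PP}^{\tilde{A}} \subset \mathsf{P}^{A}/\mathsf{poly}$. Then certainly
$\mathsf{PP}^{\tilde{A}} \subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$, so Theorem 3.5 implies that $\mathsf{P}^{\#\mathsf{P}^{A}} \subseteq \mathsf{MA}^{\tilde{A}}$. Therefore $(\Sigma_2^{\mathsf{P}})^{A} \subseteq \mathsf{MA}^{\tilde{A}}$ as well, since
Toda’s Theorem [39] (which relativizes) tells us that $\Sigma_2^{\mathsf{P}} \subseteq \mathsf{P}^{\#\mathsf{P}}$ and hence $(\Sigma_2^{\mathsf{P}})^{A} \subseteq \mathsf{P}^{\#\mathsf{P}^{A}}$. But
Kannan’s Theorem [23] (which also relativizes) tells us that $\Sigma_2^{\mathsf{P}} \not\subset \mathsf{SIZE}(n^k)$ for fixed $k$, and hence
$(\Sigma_2^{\mathsf{P}})^{A} \not\subset \mathsf{SIZE}^{A}(n^k)$. Therefore $\mathsf{MA}^{\tilde{A}} \not\subset \mathsf{SIZE}^{A}(n^k)$. So since $\mathsf{MA} \subseteq \mathsf{PP}$ and this inclusion
relativizes, $\mathsf{PP}^{\tilde{A}} \not\subset \mathsf{SIZE}^{A}(n^k)$ as well. ∎

In a similar vein, Buhrman, Fortnow, and Thierauf [8] showed that $\mathsf{MA}_{\mathsf{EXP}} \not\subset \mathsf{P/poly}$, and also
that this circuit lower bound fails to relativize. We now show that it algebrizes.

Theorem 3.17 For all $A, \tilde{A}$, we have $\mathsf{MA}^{\tilde{A}}_{\mathsf{EXP}} \not\subset \mathsf{P}^{A}/\mathsf{poly}$.


Proof. Suppose $\mathsf{MA}^{\tilde{A}}_{\mathsf{EXP}} \subset \mathsf{P}^{A}/\mathsf{poly} \subseteq \mathsf{P}^{\tilde{A}}/\mathsf{poly}$. Then certainly $\mathsf{PP}^{\tilde{A}} \subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$ as well, so
Theorem 3.5 implies that $\mathsf{P}^{\#\mathsf{P}^{A}} \subseteq \mathsf{MA}^{\tilde{A}}$. Hence we also have $(\Sigma_2^{\mathsf{P}})^{A} \subseteq \mathsf{MA}^{\tilde{A}}$ by Toda’s Theorem
[39], and hence $(\Sigma_2^{\mathsf{EXP}})^{A} \subseteq \mathsf{MA}^{\tilde{A}}_{\mathsf{EXP}}$ by padding. But Kannan’s Theorem [23] tells us that
$(\Sigma_2^{\mathsf{EXP}})^{A} \not\subset \mathsf{P}^{A}/\mathsf{poly}$, so $\mathsf{MA}^{\tilde{A}}_{\mathsf{EXP}} \not\subset \mathsf{P}^{A}/\mathsf{poly}$ as well. ∎

Finally, Santhanam [36] recently showed that $\mathsf{PromiseMA} \not\subset \mathsf{SIZE}(n^k)$ for all constants $k$. Let
us show that Santhanam’s result algebrizes as well.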


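Theorem 3.18 For all $A, \tilde{A}$ and constants $k$, we have $\mathsf{PromiseMA}^{\tilde{A}} \not\subset \mathsf{SIZE}^{A}(n^k)$.


Proof. First suppose $\mathsf{PP}^{\tilde{A}} \subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$. Then $\mathsf{P}^{\#\mathsf{P}^{A}} \subseteq \mathsf{MA}^{\tilde{A}}$ by Theorem 3.5. Hence $(\Sigma_2^{\mathsf{P}})^{A} \subseteq \mathsf{MA}^{\tilde{A}}$
by Toda’s Theorem [39], so by Kannan’s Theorem [23] we have $\mathsf{MA}^{\tilde{A}} \not\subset \mathsf{SIZE}^{A}(n^k)$ and are done.

Next suppose $\mathsf{PP}^{\tilde{A}} \not\subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$. Then there is some superpolynomial function $s$ (not necessarily
time-constructible) such that

\[ \mathrm{MAJFSAT}^{\tilde{A}} \in \mathsf{SIZE}^{\tilde{A}}(s(n)) \setminus \mathsf{SIZE}^{\tilde{A}}(s(n) - 1). \]

We define a promise problem $(L'_{\mathrm{YES}}, L'_{\mathrm{NO}})$ by padding $\mathrm{MAJFSAT}^{\tilde{A}}_n$ as follows:

\[ L'_{\mathrm{YES}} := \left\{ x 1^{s(n)^{1/2k}} : x \in \mathrm{MAJFSAT}^{\tilde{A}}_n \right\}, \]
\[ L'_{\mathrm{NO}} := \left\{ x 1^{s(n)^{1/2k}} : x \notin \mathrm{MAJFSAT}^{\tilde{A}}_n \right\}. \]

Our first claim is that $(L'_{\mathrm{YES}}, L'_{\mathrm{NO}}) \notin \mathsf{SIZE}^{A}(n^k)$. For suppose otherwise; then by ignoring the
padding, we would obtain circuits for $\mathrm{MAJFSAT}^{\tilde{A}}_n$ of size

\[ \left(n + s(n)^{1/2k}\right)^k \ll s(n), \]

contrary to assumption.

Our second claim is that $(L'_{\mathrm{YES}}, L'_{\mathrm{NO}}) \in \mathsf{PromiseMA}^{\tilde{A}}$. This is because, on input $x 1^{s(n)^{1/2k}}$, a
$\mathsf{PromiseMA}^{\tilde{A}}$ machine can guess a circuit for $\mathrm{MAJFSAT}^{\tilde{A}}_n$ of size $s(n)$, and then use Theorem 3.5
to verify that it works. ∎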


3.5 Other Algebrizing Results 


Impagliazzo, Kabanets, and Wigderson [20] proved that $\mathsf{NEXP} \subset \mathsf{P/poly}$ implies $\mathsf{NEXP} = \mathsf{MA}$. In
the proof of this theorem, the only non-relativizing ingredient is the standard result that
$\mathsf{EXP} \subset \mathsf{P/poly}$ implies $\mathsf{EXP} = \mathsf{MA}$, which is algebrizing by Corollary 3.15. One can thereby show that the
IKW theorem is algebrizing as well. More precisely, for all $A, \tilde{A}$ we have

\[ \mathsf{NEXP}^{A[\mathrm{poly}]} \subset \mathsf{P}^{\tilde{A}}/\mathsf{poly} \Longrightarrow \mathsf{NEXP}^{A[\mathrm{poly}]} \subseteq \mathsf{MA}^{\tilde{A}}. \]

⁸ Note that Santhanam originally proved his result using a “tight” variant of the IP = PSPACE theorem, due to
Trevisan and Vadhan [40]. We instead use a tight variant of the LFKN theorem. However, we certainly expect that
the Trevisan-Vadhan theorem, and the proof of Santhanam based on it, would algebrize as well.

Feige and Kilian [12] showed that $\mathsf{RG} = \mathsf{EXP}$, where RG is Refereed Games: informally, the
class of languages $L$ decidable by a probabilistic polynomial-time verifier that can interact (and
exchange private messages) with two competing provers, one trying to convince the verifier that
$x \in L$ and the other that $x \notin L$. By analogy to IP = PSPACE and MIP = NEXP, one would expect
this theorem to algebrize. And indeed it does, but it turns out to relativize as well! Intuitively,
this is because the RG protocol of Feige and Kilian involves only multilinear extensions of Turing
machine tableaus, and not arithmetization as used (for example) in the IP = PSPACE theorem.
We omit the details.


4 Lower Bounds on Algebraic Query Complexity 


What underlies our algebraic oracle separations is a new model of algebraic query complexity.
In the standard query complexity model, an algorithm is trying to compute some property of a
Boolean function $A : \{0,1\}^n \to \{0,1\}$ by querying $A$ on various points. In our model, the function
$A : \{0,1\}^n \to \{0,1\}$ will still be Boolean, but the algorithm will be allowed to query not just $A$,
but also a low-degree extension $\tilde{A} : \mathbb{F}^n \to \mathbb{F}$ of $A$ over some field $\mathbb{F}$. In this section we develop the
algebraic query complexity model in its own right, and prove several lower bounds in this model.
Then, in Section 5, we apply our lower bounds to prove algebraic oracle separations. Section 6
will consider the variant where the algorithm can query an extension of $A$ over the ring of integers.

Throughout this section we let $N = 2^n$. Algorithms will compute Boolean functions (properties)
$f : \{0,1\}^N \to \{0,1\}$. An input $A$ to $f$ will be viewed interchangeably as an $N$-bit string $A \in \{0,1\}^N$
or as a Boolean function $A : \{0,1\}^n \to \{0,1\}$ of which the string is the truth table.

Let us recall some standard query complexity measures. Given a Boolean function
$f : \{0,1\}^N \to \{0,1\}$, the deterministic query complexity of $f$, or $D(f)$, is defined to be the minimum number
of queries made by any deterministic algorithm that evaluates $f$ on every input. Likewise, the
(bounded-error) randomized query complexity $R(f)$ is defined to be the minimum expected⁹ number
of queries made by any randomized algorithm that evaluates $f$ with probability at least 2/3 on
every input. The bounded-error quantum query complexity $Q(f)$ is defined analogously, with
quantum algorithms in place of randomized ones. See Buhrman and de Wolf [10] for a survey of
these measures.

We now define similar measures for algebraic query complexity. In our definition, an important 
parameter will be the multidegree of the allowed extension (recall that mdeg (p) is the largest degree 
of any of the variables of p). In all of our results, this parameter will be either 1 or 2. 


Definition 4.1 (Algebraic Query Complexity Over Fields) Let $f : \{0,1\}^N \to \{0,1\}$ be a
Boolean function, let $\mathbb{F}$ be any field, and let $c$ be a positive integer. Also, let $\mathcal{M}$ be the set of
deterministic algorithms $M$ such that $M^{\tilde{A}}$ outputs $f(A)$ for every oracle $A : \{0,1\}^n \to \{0,1\}$ and
every finite field extension $\tilde{A} : \mathbb{F}^n \to \mathbb{F}$ of $A$ with $\mathrm{mdeg}(\tilde{A}) \le c$. Then the deterministic algebraic
query complexity of $f$ over $\mathbb{F}$ is defined as

\[ \tilde{D}_{\mathbb{F},c}(f) := \min_{M \in \mathcal{M}} \; \max_{A, \tilde{A} \,:\, \mathrm{mdeg}(\tilde{A}) \le c} T_M(\tilde{A}), \]

where $T_M(\tilde{A})$ is the number of queries to $\tilde{A}$ made by $M^{\tilde{A}}$. The randomized and quantum algebraic
query complexities $\tilde{R}_{\mathbb{F},c}(f)$ and $\tilde{Q}_{\mathbb{F},c}(f)$ are defined similarly, except with (bounded-error)
randomized and quantum algorithms in place of deterministic ones.


⁹ Or the worst-case number of queries: up to the exact constant in the success probability, one can always ensure
that this is about the same as the expected number.
4.1 Multilinear Polynomials


The construction of “adversary polynomials” in our lower bound proofs will require some useful
facts about multilinear polynomials. In particular, the basis of delta functions for these polynomials
will come in handy.

In what follows $\mathbb{F}$ is an arbitrary field (finite or infinite). Given a Boolean point $z$, define

\[ \delta_z(x) := \prod_{i : z_i = 1} x_i \prod_{i : z_i = 0} (1 - x_i) \]

to be the unique multilinear polynomial that is 1 at $z$ and 0 elsewhere on the Boolean cube. Then
for an arbitrary multilinear polynomial $m : \mathbb{F}^n \to \mathbb{F}$, we can write $m$ uniquely in the basis of $\delta_z$’s
as follows:

\[ m(x) = \sum_{z \in \{0,1\}^n} m_z \delta_z(x). \]

We will often identify a multilinear polynomial $m$ with its coefficients $m_z$ in this basis. Note that
for any Boolean point $z$, the value $m(z)$ is simply the coefficient $m_z$ in the above representation.
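
To make the delta-function basis concrete, here is a small Python sketch that evaluates the multilinear extension of a Boolean function from its truth table. The field is modeled as the integers modulo a prime $q$, and the function names and the dictionary representation of the truth table are assumptions of this illustration.

```python
from itertools import product

def delta(z, x, q):
    """delta_z(x) = prod_{i: z_i=1} x_i * prod_{i: z_i=0} (1 - x_i) over F_q."""
    value = 1
    for zi, xi in zip(z, x):
        value = value * (xi if zi == 1 else 1 - xi) % q
    return value

def multilinear_extension(truth_table, x, q):
    """Evaluate sum_z A(z) * delta_z(x); truth_table maps Boolean tuples to 0/1."""
    n = len(x)
    return sum(truth_table[z] * delta(z, x, q)
               for z in product((0, 1), repeat=n)) % q

# XOR on two bits; its multilinear extension is x + y - 2xy.
xor = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```

On Boolean points the extension reproduces the truth table, and at the non-Boolean point $(2,3)$ over $\mathbb{F}_7$ it gives $2 + 3 - 12 \equiv 0$.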


4.2 Lower Bounds by Direct Construction 


We now prove lower bounds on algebraic query complexity over fields. The goal will be to show that 
querying points outside the Boolean cube is useless if one wants to gain information about values 
on the Boolean cube. In full generality, this is of course false (as witnessed by interactive proofs and 
PCPs on the one hand, and by the result of Juma et al. [21] on the other). To make our adversary 
arguments work, it will be crucial to give ourselves sufficient freedom, by using polynomials of 
multidegree 2 rather than multilinear polynomials. 

We first prove deterministic lower bounds, which are quite simple, and then extend them to 
probabilistic lower bounds. Both work for the natural NP predicate of finding a Boolean point z 
such that A(z) = 1. 


4.2.1 Deterministic Lower Bounds 




















Lemma 4.2 Let $\mathbb{F}$ be a field and let $y_1,\ldots,y_t$ be points in $\mathbb{F}^n$. Then there exists a multilinear
polynomial $m : \mathbb{F}^n \to \mathbb{F}$ such that


(i) $m(y_i) = 0$ for all $i \in [t]$, and
(ii) $m(z) = 1$ for at least $2^n - t$ Boolean points $z$.


Proof. If we represent $m$ as

\[ m(x) = \sum_{z \in \{0,1\}^n} m_z \delta_z(x), \]

then the constraint $m(y_i) = 0$ for all $i \in [t]$ corresponds to $t$ linear equations over $\mathbb{F}$ relating the $2^n$
coefficients $m_z$. By basic linear algebra, it follows that there must be a solution in which at least
$2^n - t$ of the $m_z$’s are set to 1, and hence $m(z) = 1$ for at least $2^n - t$ Boolean points $z$. ∎
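
The “basic linear algebra” step can be made explicit with a short Python sketch over a prime field $\mathbb{F}_q$. The approach shown here is one way to realize the argument, not necessarily the one the authors had in mind: row-reduce the $t \times 2^n$ system, set every free coefficient to 1, and solve for the at most $t$ pivot coefficients. Function names are hypothetical.

```python
from itertools import product

def delta(z, x, q):
    """delta_z(x) over F_q, as defined in Section 4.1."""
    value = 1
    for zi, xi in zip(z, x):
        value = value * (xi if zi == 1 else 1 - xi) % q
    return value

def adversary_multilinear(points, n, q):
    """Return coefficients m_z with m(y) = 0 for every queried point y and
    m_z = 1 for at least 2^n - len(points) Boolean points z."""
    cube = list(product((0, 1), repeat=n))
    rows = [[delta(z, y, q) for z in cube] for y in points]
    pivots = []                       # (row index, column index) pairs
    r = 0
    for c in range(len(cube)):        # Gauss-Jordan elimination over F_q
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        inv = pow(rows[r][c], -1, q)
        rows[r] = [v * inv % q for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [(a - factor * b) % q for a, b in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
        if r == len(rows):
            break
    pivot_cols = {c for _, c in pivots}
    coeffs = [1] * len(cube)          # every free coefficient is set to 1
    for row, c in pivots:             # pivot coefficients are forced by the equations
        coeffs[c] = -sum(rows[row][j] for j in range(len(cube))
                         if j not in pivot_cols) % q
    return dict(zip(cube, coeffs))

def evaluate(coeffs, x, q):
    """Evaluate m(x) = sum_z m_z * delta_z(x) over F_q."""
    return sum(c * delta(z, x, q) for z, c in coeffs.items()) % q
```

Since the rank of the system is at most $t$, at least $2^n - t$ coefficients are free and hence equal to 1, matching property (ii).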





Lemma 4.3 Let $\mathbb{F}$ be a field and let $y_1,\ldots,y_t$ be points in $\mathbb{F}^n$. Then for at least $2^n - t$ Boolean
points $w \in \{0,1\}^n$, there exists a multiquadratic extension polynomial $p : \mathbb{F}^n \to \mathbb{F}$ such that

(i) $p(y_i) = 0$ for all $i \in [t]$,
(ii) $p(w) = 1$, and
(iii) $p(z) = 0$ for all Boolean $z \ne w$.


Proof. Let $m : \mathbb{F}^n \to \mathbb{F}$ be the multilinear polynomial from Lemma 4.2, and pick any Boolean $w$
such that $m(w) = 1$. Then a multiquadratic extension polynomial $p$ satisfying properties (i)-(iii)
can be obtained from $m$ as follows:

\[ p(x) := m(x)\,\delta_w(x). \]

∎

Given a Boolean function $A : \{0,1\}^n \to \{0,1\}$, let the OR problem be that of deciding whether
there exists an $x \in \{0,1\}^n$ such that $A(x) = 1$. Then Lemma 4.3 easily yields an exponential
lower bound on the algebraic query complexity of the OR problem.


Theorem 4.4 $\tilde{D}_{\mathbb{F},2}(\mathrm{OR}) = 2^n$ for every field $\mathbb{F}$.


Proof. Let $\mathcal{Y}$ be the set of points queried by a deterministic algorithm, and suppose $|\mathcal{Y}| < 2^n$.
Then Lemma 4.3 implies that there exists a multiquadratic extension polynomial $\tilde{A} : \mathbb{F}^n \to \mathbb{F}$ such
that $\tilde{A}(y) = 0$ for all $y \in \mathcal{Y}$, but $\tilde{A}(w) = 1$ for some Boolean $w$. So even if the algorithm is
adaptive, we can let $\mathcal{Y}$ be the set of points it queries assuming each query is answered with 0, and
then find $\tilde{A}, \tilde{B}$ such that $\tilde{A}(y) = \tilde{B}(y) = 0$ for all $y \in \mathcal{Y}$, but nevertheless $\tilde{A}$ and $\tilde{B}$ lead to different
values of the OR function. ∎

Again, the results of Juma et al. [21] imply that multidegree 2 is essential here, since for 
multilinear polynomials it is possible to solve the OR problem with only one query (over fields of 
characteristic greater than 2). 

Though Lemma 4.3 sufficed for the basic query complexity lower bound, our oracle separations 
will require a more general result. The following lemma generalizes Lemma 4.3 in three ways: it 
handles extensions over many fields simultaneously instead of just one field; it lets us fix the queried 
points to any desired values instead of just zero; and it lets us toggle the values on many Boolean 
points instead of just the single Boolean point w. 




















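Lemma 4.5 Let $\mathcal{F}$ be a collection of fields (possibly with multiplicity). Let $f : \{0,1\}^n \to \{0,1\}$
be a Boolean function, and for every $\mathbb{F} \in \mathcal{F}$, let $p_{\mathbb{F}} : \mathbb{F}^n \to \mathbb{F}$ be a multiquadratic polynomial over
$\mathbb{F}$ extending $f$. Also let $\mathcal{Y}_{\mathbb{F}} \subseteq \mathbb{F}^n$ for each $\mathbb{F} \in \mathcal{F}$, and $t := \sum_{\mathbb{F} \in \mathcal{F}} |\mathcal{Y}_{\mathbb{F}}|$. Then there exists a subset
$B \subseteq \{0,1\}^n$, with $|B| \le t$, such that for all Boolean functions $f' : \{0,1\}^n \to \{0,1\}$ that agree with
$f$ on $B$, there exist multiquadratic polynomials $p'_{\mathbb{F}} : \mathbb{F}^n \to \mathbb{F}$ (one for each $\mathbb{F} \in \mathcal{F}$) such that

(i) $p'_{\mathbb{F}}$ extends $f'$, and

(ii) $p'_{\mathbb{F}}(y) = p_{\mathbb{F}}(y)$ for all $y \in \mathcal{Y}_{\mathbb{F}}$.


Proof. Call a Boolean point $z$ good if for every $\mathbb{F} \in \mathcal{F}$, there exists a multiquadratic polynomial
$u_{\mathbb{F},z} : \mathbb{F}^n \to \mathbb{F}$ such that

(i′) $u_{\mathbb{F},z}(y) = 0$ for all $y \in \mathcal{Y}_{\mathbb{F}}$,

(ii′) $u_{\mathbb{F},z}(z) = 1$, and

(iii′) $u_{\mathbb{F},z}(w) = 0$ for all Boolean $w \ne z$.

Then by Lemma 4.3, each $\mathbb{F} \in \mathcal{F}$ can prevent at most $|\mathcal{Y}_{\mathbb{F}}|$ points from being good. Hence
there are at least $2^n - t$ good points.

Now let $G$ be the set of all good points, and $B = \{0,1\}^n \setminus G$ be the set of all “bad” points.
Then for all $\mathbb{F} \in \mathcal{F}$, we can obtain a polynomial $p'_{\mathbb{F}}$ satisfying (i) and (ii) as follows:

\[ p'_{\mathbb{F}}(x) := p_{\mathbb{F}}(x) + \sum_{z \in G} \left(f'(z) - f(z)\right) u_{\mathbb{F},z}(x). \]

∎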


4.2.2 Probabilistic Lower Bounds 


We now prove a lower bound for randomized algorithms. As usual, this will be done via the 
Yao minimax principle, namely by constructing a distribution over oracles which is hard for every 
deterministic algorithm that queries few points. Results in this subsection are only for finite fields, 
the reason being that they allow a uniform distribution over sets of all polynomials with given 
restrictions. 


Lemma 4.6 Let $\mathbb{F}$ be a finite field. Also, for all $w \in \{0,1\}^n$, let $\mathcal{D}_w$ be the uniform distribution
over multiquadratic polynomials $p : \mathbb{F}^n \to \mathbb{F}$ such that $p(w) = 1$ and $p(z) = 0$ for all Boolean
$z \ne w$. Suppose an adversary chooses a “marked point” $w \in \{0,1\}^n$ uniformly at random, and
then chooses $p$ according to $\mathcal{D}_w$. Then any deterministic algorithm, after making $t$ queries to $p$,
will have queried $w$ with probability at most $t/2^n$.


Proof. Let $y_i \in \mathbb{F}^n$ be the $i$-th point queried, so that $y_1,\ldots,y_t$ is the list of points queried by step
$t$. Then as in Lemma 4.5, call a Boolean point $z$ good if there exists a multiquadratic polynomial
$u : \mathbb{F}^n \to \mathbb{F}$ such that

(i) $u(y_i) = 0$ for all $i \in [t]$,
(ii) $u(z) = 1$, and
(iii) $u(z') = 0$ for all Boolean $z' \ne z$.


Otherwise call $z$ bad. Let $G_t$ be the set of good points immediately after the $t$-th step, and let
$B_t = \{0,1\}^n \setminus G_t$ be the set of bad points. Then it follows from Lemma 4.3 that $|G_t| \ge 2^n - t$,
and correspondingly $|B_t| \le t$. Also notice that $B_t \subseteq B_{t+1}$ for all $t$.

For every good point $z \in \{0,1\}^n$, fix a “canonical” multiquadratic polynomial $u_z$ that satisfies
properties (i)-(iii) above. Also, for every Boolean point $z$, let $V_z$ be the set of multiquadratic
polynomials $v : \mathbb{F}^n \to \mathbb{F}$ such that

(i) $v(y_i) = p(y_i)$ for all $i \in [t]$,
(ii) $v(z) = 1$, and
(iii) $v(z') = 0$ for all Boolean $z' \ne z$.


Now let $x, x' \in G_t$ be any two good points.


Claim 4.7 Even conditioned on the values of $p(y_1),\ldots,p(y_t)$, the probability that $p(x) = 1$ is
equal to the probability that $p(x') = 1$.


To prove Claim 4.7, it suffices to show that $|V_x| = |V_{x'}|$. We will do so by exhibiting a one-to-one
correspondence between $V_x$ and $V_{x'}$. Our correspondence is simply the following:

\[ v \in V_x \iff v + u_{x'} - u_x \in V_{x'}. \]

Now imagine that at every step $i$, all points in $B_i$ are automatically queried “free of charge.”
This assumption can only help the algorithm, and hence make our lower bound stronger.


Claim 4.8 Suppose that by step $t$, the marked point $w$ still has not been queried. Then the
probability that $w$ is queried in step $t+1$ is at most

\[ \frac{|B_{t+1}| - |B_t|}{2^n - |B_t|}. \]

To prove Claim 4.8, notice that after $t$ steps, there are $2^n - |B_t|$ points still in $G_t$, and by
Claim 4.7, any of those points is as likely to be $w$ as any other. Furthermore, there are at most
$|B_{t+1}| - |B_t|$ points queried in step $t+1$ that were not queried previously. For there are
$|B_{t+1}| - |B_t|$ points in $B_{t+1} \setminus B_t$ that are queried “free of charge,” plus one point $y_{t+1}$ that is queried
explicitly by the algorithm. Naively this would give $|B_{t+1}| - |B_t| + 1$, but notice further that if
$y_{t+1}$ is Boolean, then $y_{t+1} \in B_{t+1}$.

Now, the probability that the marked point was not queried in steps 1 through $t$ is just
$1 - |B_t|/2^n$. Therefore, the total probability of having queried $w$ after $t$ steps is

\[ \sum_{i=0}^{t-1} \left(1 - \frac{|B_i|}{2^n}\right) \frac{|B_{i+1}| - |B_i|}{2^n - |B_i|} = \sum_{i=0}^{t-1} \frac{|B_{i+1}| - |B_i|}{2^n} \le \frac{t}{2^n}. \]

∎
An immediate corollary of Lemma 4.6 is that, over a finite field, randomized algebraic query
algorithms do no better than deterministic ones at evaluating the OR function.


Theorem 4.9 $\tilde{R}_{\mathbb{F},2}(\mathrm{OR}) = \Omega(2^n)$ for every finite field $\mathbb{F}$.


To give an algebraic oracle separation between NP and BPP, we will actually need a slight
extension of Lemma 4.6, which can be proven similarly to Lemma 4.5 (we omit the details).


Lemma 4.10 Given a finite field $\mathbb{F}$ and string $w \in \{0,1\}^n$, let $\mathcal{D}_{w,\mathbb{F}}$ be the uniform distribution
over multiquadratic polynomials $p : \mathbb{F}^n \to \mathbb{F}$ such that $p(w) = 1$ and $p(z) = 0$ for all Boolean $z \ne w$.
Suppose an adversary chooses $w \in \{0,1\}^n$ uniformly at random, and then for every finite field $\mathbb{F}$,
chooses $p_{\mathbb{F}}$ according to $\mathcal{D}_{w,\mathbb{F}}$. Then any algorithm, after making $t$ queries to any combination of
$p_{\mathbb{F}}$’s, will have queried $w$ with probability at most $t/2^n$.











4.3 Lower Bounds by Communication Complexity 


In this section we point out a simple connection between algebraic query complexity and
communication complexity. Specifically, we show that algebraic query algorithms can be efficiently
simulated by Boolean communication protocols. This connection will allow us to derive many
lower bounds on algebraic query complexity that we do not know how to prove with the direct
techniques of the previous section. Furthermore, it will give lower bounds even for multilinear
extensions, and even for extensions over the integers. The drawbacks are that (1) the functions
for which we obtain the lower bounds are somewhat more complicated (for example, Disjointness
instead of OR), and (2) this technique does not seem useful for proving algebraic oracle collapses
(such as $\mathsf{NP}^{\tilde{A}} \subset \mathsf{SIZE}^{A}(n)$).

For concreteness, we first state our “transfer principle” for deterministic query and
communication complexities, but as we will see, the principle is much broader.





Theorem 4.11 Let $A : \{0,1\}^n \to \{0,1\}$ be a Boolean function, and let $\tilde{A} : \mathbb{F}^n \to \mathbb{F}$ be the unique
multilinear extension of $A$ over a finite field $\mathbb{F}$. Suppose one can evaluate some Boolean predicate
$f$ of $A$ using $T$ deterministic adaptive queries to $\tilde{A}$. Also, let $A_0$ and $A_1$ be the subfunctions of $A$
obtained by restricting the first bit to 0 or 1 respectively. Then if Alice is given the truth table of
$A_0$ and Bob is given the truth table of $A_1$, they can jointly evaluate $f(A)$ using $O(Tn \log |\mathbb{F}|)$ bits
of communication.


Proof. Given any point $y \in \mathbb{F}^n$, we can write $\tilde{A}(y)$ as a linear combination of the values taken by
$A$ on the Boolean cube, like so:

\[ \tilde{A}(y) = \sum_{z \in \{0,1\}^n} \delta_z(y)\, A(z). \]

Now let $M$ be an algorithm that evaluates $f$ using $T$ queries to $\tilde{A}$. Our communication protocol
will simply perform a step-by-step simulation of $M$, as follows.

Let $y_1 \in \mathbb{F}^n$ be the first point queried by $M$. Then Alice computes the partial sum

\[ \tilde{A}_0(y_1) := \sum_{z \in \{0,1\}^{n-1}} \delta_{0z}(y_1)\, A(0z) \]

and sends $(y_1, \tilde{A}_0(y_1))$ to Bob. Next Bob computes

\[ \tilde{A}_1(y_1) := \sum_{z \in \{0,1\}^{n-1}} \delta_{1z}(y_1)\, A(1z), \]

from which he learns $\tilde{A}(y_1) = \tilde{A}_0(y_1) + \tilde{A}_1(y_1)$. Bob can then determine $y_2$, the second point
queried by $M$ given that the first query had outcome $\tilde{A}(y_1)$. So next Bob computes $\tilde{A}_1(y_2)$ and
sends $(y_2, \tilde{A}_1(y_2))$ to Alice. Next Alice computes $\tilde{A}(y_2) = \tilde{A}_0(y_2) + \tilde{A}_1(y_2)$, determines $y_3$, and
sends $(y_3, \tilde{A}_0(y_3))$ to Bob, and so on for $T$ rounds.

Each message uses $O(n \log |\mathbb{F}|)$ bits, from which it follows that the total communication cost is
$O(Tn \log |\mathbb{F}|)$. ∎
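
The simulation in this proof can be sketched in Python as below. The sketch rests on several simplifying assumptions that are not in the paper: the field is the integers modulo a prime $q$; the query algorithm $M$ is modeled by a hypothetical callable `next_step(answers)` that returns either `("query", y)` or `("output", value)` given the answers so far; and the code only counts messages rather than modeling which player speaks in which round. Each counted message carries a point $y$ and one partial sum, that is, $n+1$ field elements.

```python
def delta(z, x, q):
    """delta_z(x) over F_q, as defined in Section 4.1."""
    value = 1
    for zi, xi in zip(z, x):
        value = value * (xi if zi == 1 else 1 - xi) % q
    return value

def partial_sum(first_bit, half_table, y, q):
    """Sum over z in {0,1}^(n-1) of delta_{bz}(y) * A(bz). It depends only on
    the half of the truth table held by one player (b = first_bit)."""
    return sum(bit * delta((first_bit,) + z, y, q)
               for z, bit in half_table.items()) % q

def simulate(next_step, table0, table1, q):
    """Run the query algorithm, answering each query from the two partial sums.
    Returns (output of the algorithm, number of messages exchanged)."""
    answers, messages = [], 0
    while True:
        kind, payload = next_step(answers)
        if kind == "output":
            return payload, messages
        y = payload
        share0 = partial_sum(0, table0, y, q)   # computable by Alice alone
        share1 = partial_sum(1, table1, y, q)   # computable by Bob alone
        answers.append((share0 + share1) % q)   # value of the extension at y
        messages += 1                           # one message (y, share) per query
```

For example, with $A$ equal to XOR on two bits, Alice holds `{(0,): 0, (1,): 1}` and Bob holds `{(0,): 1, (1,): 0}`; a single query at $(2,3)$ over $\mathbb{F}_7$ is answered by the shares 4 and 3, whose sum is 0 modulo 7.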

In proving Theorem 4.11, notice that we never needed the assumption that M was deterministic. 
Had M been randomized, our simulation would have produced a randomized protocol; had M been 
quantum, it would have produced a quantum protocol; had M been an MA machine, it would have 
produced an MA protocol, and so on. 

To illustrate the power of Theorem 4.11, let us now prove a lower bound on algebraic query 
complexity without using anything about polynomials. 

Given two Boolean strings $x = x_1 \ldots x_N$ and $y = y_1 \ldots y_N$, recall that the
Disjointness problem is to decide whether there exists an index $i \in [N]$ such that
$x_i = y_i = 1$. Supposing that Alice holds $x$ and Bob holds $y$, Kalyanasundaram and Schnitger
[22] showed that any randomized protocol to solve this problem requires Alice and Bob to exchange
$\Omega(N)$ bits (see also the simpler proof by Razborov [33]).

In our setting, the problem becomes the following: given a Boolean function
$A : \{0,1\}^n \to \{0,1\}$, decide whether there exists an $x \in \{0,1\}^{n-1}$ such that
$A(0x) = A(1x) = 1$. Call this problem DISJ, and suppose we want to solve DISJ using a randomized
algorithm that queries the multilinear extension $\tilde{A} : \mathbb{F}^n \to \mathbb{F}$ of $A$.
Then Theorem 4.11 immediately yields a lower bound on the number of queries to $\tilde{A}$ that we
need:





Theorem 4.12 $\tilde{R}_{\mathbb{F},1}(\mathrm{DISJ}) = \Omega\!\left(\frac{2^n}{n \log|\mathbb{F}|}\right)$ for all finite fields $\mathbb{F}$.





Proof. Suppose $\tilde{R}_{\mathbb{F},1}(\mathrm{DISJ}) = o\!\left(\frac{2^n}{n \log|\mathbb{F}|}\right)$.
Then by Theorem 4.11, we get a randomized protocol for the Disjointness problem with communication
cost $o(N)$, where $N = 2^{n-1}$. But this contradicts the lower bound of Razborov [33] and
Kalyanasundaram and Schnitger [22] mentioned above. ∎
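The counting behind this proof can be written out explicitly. In the sketch below, $c$ and $c'$ are hypothetical names for the unnamed positive constants hidden in the $O(Tn\log|\mathbb{F}|)$ upper bound of Theorem 4.11 and in the $\Omega(N)$ communication lower bound; they do not appear in the text.

```latex
% T = number of queries, N = 2^{n-1}; c, c' > 0 are illustrative constants.
\[
  c\,T\,n\log|\mathbb{F}| \;\ge\; c' N \;=\; c'\,2^{n-1}
  \quad\Longrightarrow\quad
  T \;\ge\; \frac{c'}{2c}\cdot\frac{2^{n}}{n\log|\mathbb{F}|}
  \;=\; \Omega\!\left(\frac{2^{n}}{n\log|\mathbb{F}|}\right).
\]
```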

In Section 5, we will use the transfer principle to convert many known communication complexity 
results into algebraic oracle separations. 


5 The Need for Non-Algebrizing Techniques 


In this section we show formally that solving many of the open problems in complexity theory will 
require non-algebrizing techniques. We have already done much of the work in Section 4, by proving 
lower bounds on algebraic query complexity. What remains is to combine these query complexity 
results with diagonalization or forcing arguments, in order to achieve the oracle separations and 
collapses we want. 


5.1 Non-Algebrizing Techniques Needed for P vs. NP 


We start with an easy but fundamental result: that any proof of $\mathsf{P} \neq \mathsf{NP}$ will
require non-algebrizing techniques.


Theorem 5.1 There exist $A, \tilde{A}$ such that $\mathsf{NP}^{\tilde{A}} \subset \mathsf{P}^{A}$.


Proof. Let $A$ be any PSPACE-complete language, and let $\tilde{A}$ be the unique multilinear
extension of $A$. As observed by Babai, Fortnow, and Lund [4], the multilinear extension of any
PSPACE language is also in PSPACE. So as in the usual argument of Baker, Gill, and Solovay [5], we
have $\mathsf{NP}^{\tilde{A}} = \mathsf{NP}^{\mathsf{PSPACE}} = \mathsf{PSPACE} = \mathsf{P}^{A}$. ∎

The same argument immediately implies that any proof of $\mathsf{P} \neq \mathsf{PSPACE}$ will
require non-algebrizing techniques:


Theorem 5.2 There exist $A, \tilde{A}$ such that $\mathsf{PSPACE}^{\tilde{A}[\mathrm{poly}]} = \mathsf{P}^{A}$.


Next we show that any proof of P = NP would require non-algebrizing techniques, by giving an 
algebraic oracle separation between P and NP. As in the original work of Baker, Gill, and Solovay 
[5], this direction is the harder of the two. 


Theorem 5.3 There exist $A, \tilde{A}$ such that $\mathsf{NP}^{A} \not\subset \mathsf{P}^{\tilde{A}}$.
Furthermore, the language $L$ that achieves the separation simply corresponds to deciding, on
inputs of length $n$, whether there exists a $w \in \{0,1\}^n$ with $A_n(w) = 1$.
Proof. Our proof closely follows the usual diagonalization argument of Baker, Gill, and Solovay [5],
except that we have to use Lemma 4.5 to handle the fact that $\mathsf{P}$ can query a low-degree
extension.

For every $n$, the oracle $A$ will contain a Boolean function $A_n : \{0,1\}^n \to \{0,1\}$, while
$\tilde{A}$ will contain a multiquadratic extension
$\tilde{A}_{n,\mathbb{F}} : \mathbb{F}^n \to \mathbb{F}$ of $A_n$ for every $n$ and finite field
$\mathbb{F}$. Let $L$ be the unary language consisting of all strings $1^n$ for which there exists
a $w \in \{0,1\}^n$ such that $A_n(w) = 1$. Then clearly $L \in \mathsf{NP}^{A}$ for all $A$. Our
goal is to choose $A, \tilde{A}$ so that $L \notin \mathsf{P}^{\tilde{A}}$.

Let $M_1, M_2, \ldots$ be an enumeration of $\mathsf{DTIME}(n^{\log n})$ oracle machines. Also, let
$M_i(n) = 1$ if $M_i$ accepts on input $1^n$ and $M_i(n) = 0$ otherwise, and let $L(n) = 1$ if
$1^n \in L$ and $L(n) = 0$ otherwise. Then it suffices to ensure that for every $i$, there exists
an $n$ such that $M_i(n) \neq L(n)$.

The construction of $\tilde{A}$ proceeds in stages. At stage $i$, we assume that
$L(1), \ldots, L(i-1)$ are already fixed, and that for each $j < i$, we have already found an $n_j$
such that $M_j(n_j) \neq L(n_j)$. Let $S_j$ be the set of all indices $n$ such that some
$\tilde{A}_{n,\mathbb{F}}$ is queried by $M_j$ on input $1^{n_j}$. Let $T_i := \bigcup_{j<i} S_j$.
Then for all $n \in T_i$, we consider every $\tilde{A}_{n,\mathbb{F}}$ to be “fixed”: that is, it
will not change in stage $i$ or any later stage.

Let $n_i$ be the least $n$ such that $n \notin T_i$ and $2^n > n^{\log n}$. Then simulate the
machine $M_i$ on input $1^{n_i}$, with the oracle behaving as follows:

(i) If $M_i$ queries some $\tilde{A}_{n,\mathbb{F}}(y)$ with $n \in T_i$, return the value that was fixed in a previous stage.
(ii) If $M_i$ queries some $\tilde{A}_{n,\mathbb{F}}(y)$ with $n \notin T_i$, return 0.

Once $M_i$ halts, let $S_i$ be the set of all $n$ such that $M_i$ queried some
$\tilde{A}_{n,\mathbb{F}}$. Then for all $n \in S_i \setminus T_i$ other than $n_i$, and all
$\mathbb{F}$, fix $\tilde{A}_{n,\mathbb{F}} := 0$ to be the identically-zero polynomial. As for
$n_i$ itself, there are two cases. If $M_i$ accepted on input $1^{n_i}$, then fix
$\tilde{A}_{n_i,\mathbb{F}} := 0$ for all $\mathbb{F}$, so that $L(n_i) = 0$. On the other hand, if
$M_i$ rejected, then for all $\mathbb{F}$, let $Y_{\mathbb{F}}$ be the set of all
$y \in \mathbb{F}^{n_i}$ that $M_i$ queried. We have
$\sum_{\mathbb{F}} |Y_{\mathbb{F}}| \le n_i^{\log n_i}$. So by Lemma 4.5, there exists a Boolean
point $w \in \{0,1\}^{n_i}$ such that for all $\mathbb{F}$, we can fix
$\tilde{A}_{n_i,\mathbb{F}} : \mathbb{F}^{n_i} \to \mathbb{F}$ to be a multiquadratic polynomial
such that

(i) $\tilde{A}_{n_i,\mathbb{F}}(y) = 0$ for all $y \in Y_{\mathbb{F}}$,
(ii) $\tilde{A}_{n_i,\mathbb{F}}(w) = 1$, and
(iii) $\tilde{A}_{n_i,\mathbb{F}}(z) = 0$ for all Boolean $z \neq w$.

We then have $L(n_i) = 1$, as desired. ∎

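In the proof of Theorem 5.3, if we simply replace $2^n > n^{\log n}$ by the stronger condition
$2^{n-1} > n^{\log n}$, then an RP algorithm can replace the NP one. Thus, we immediately get the
stronger result that there exist $A, \tilde{A}$ such that
$\mathsf{RP}^{A} \not\subset \mathsf{P}^{\tilde{A}}$. Indeed, by interleaving oracles such that
$\mathsf{RP}^{A} \not\subset \mathsf{P}^{\tilde{A}}$ and
$\mathsf{coRP}^{A} \not\subset \mathsf{P}^{\tilde{A}}$, it is also possible to construct
$A, \tilde{A}$ such that $\mathsf{ZPP}^{A} \not\subset \mathsf{P}^{\tilde{A}}$ (we omit the details).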


5.2 Non-Algebrizing Techniques Needed for NP vs. BPP 


We now show an algebraic oracle separation between NP and BPP. This result implies that any
proof of $\mathsf{NP} \subset \mathsf{BPP}$ would require non-algebrizing techniques. To put it
more concretely, there is no way to solve 3SAT in probabilistic polynomial time by first
arithmetizing a 3SAT formula and then treating the result as an arbitrary low-degree black-box
polynomial.


Theorem 5.4 There exist $A, \tilde{A}$ such that $\mathsf{NP}^{A} \not\subset \mathsf{BPP}^{\tilde{A}}$.
Furthermore, the language $L$ that achieves the separation simply corresponds to finding a
$w \in \{0,1\}^n$ with $A_n(w) = 1$.
Proof. Our proof closely follows the proof of Bennett and Gill [6] that
$\mathsf{P}^{A} \neq \mathsf{NP}^{A}$ with probability 1 over $A$.

Similarly to Lemma 4.10, given a Boolean point $w$ and a finite field $\mathbb{F}$, let
$D_{n,w,\mathbb{F}}$ be the uniform distribution over all multiquadratic polynomials
$p : \mathbb{F}^n \to \mathbb{F}$ such that $p(w) = 1$ and $p(z) = 0$ for all Boolean $z \neq w$.
Then we generate the oracle $\tilde{A}$ according to the following distribution. For each
$n \in \mathbb{N}$, first draw $w_n \in \{0,1\}^n$ uniformly at random, and set $A_n(w_n) = 1$ and
$A_n(z) = 0$ for all $n$-bit Boolean strings $z \neq w_n$. Next, for every finite field
$\mathbb{F}$, draw the extension $\tilde{A}_{n,\mathbb{F}}$ of $A_n$ from $D_{n,w_n,\mathbb{F}}$.

We define the language $L$ as follows: $0^i 1^{n-i} \in L$ if and only if the $i$th bit of $w_n$ is
a 1, and $x \notin L$ for all $x$ not of the form $0^i 1^{n-i}$. Clearly $L \in \mathsf{NP}^{A}$.
Our goal is to show that $L \notin \mathsf{BPP}^{\tilde{A}}$ with probability 1 over the choice of
$\tilde{A}$.

Fix a BPP oracle machine $M$. Then let $E_{M,n,i}$ be the event that $M$ correctly decides whether
$0^i 1^{n-i} \in L$, with probability at least $2/3$ over $M$'s internal randomness, and let

$$E_{M,n} := E_{M,n,1} \wedge \cdots \wedge E_{M,n,n}.$$


Supposing $E_{M,n}$ holds, with high probability we can recover $w_n$ in polynomial time, by simply
running $M$ several times on each input $0^i 1^{n-i}$ and then outputting the majority answer as
the $i$th bit of $w_n$. But Lemma 4.10 implies that after making $t$ queries, we can guess $w_n$
with probability at most

$$\frac{t}{2^n} + \frac{1}{2^n - t},$$

just as if we had oracle access only to $A_n$ and not to the extensions $\tilde{A}_{n,\mathbb{F}}$.

So given n, choose another input size n' > n which is so large that on inputs of size n or less, 

M cannot have queried A, p for any F (for example, n’ = 2?" will work for sufficiently large n). 
Then for all sufficiently large n, we must have 














$$\Pr_{\tilde{A}}\left[E_{M,n'} \mid E_{M,1} \wedge \cdots \wedge E_{M,n}\right] \le \frac{1}{3}.$$


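This implies that

$$\Pr_{\tilde{A}}\left[E_{M,1} \wedge E_{M,2} \wedge \cdots\right] = 0.$$

But since there is only a countable infinity of BPP machines, by the union bound we get

$$\Pr_{\tilde{A}}\left[\exists M : E_{M,1} \wedge E_{M,2} \wedge \cdots\right] = 0,$$

which is what we wanted to show. ∎
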
Theorem 5.4 readily extends to show that any proof of $\mathsf{NP} \subset \mathsf{P/poly}$ would
require non-algebrizing techniques:


Theorem 5.5 There exist $A, \tilde{A}$ such that $\mathsf{NP}^{A} \not\subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$.


Proof Sketch. Suppose we have a $\mathsf{P}^{\tilde{A}}/\mathsf{poly}$ machine that decides a
language $L \in \mathsf{NP}^{A}$ using an advice string of size $n^k$. Then by guessing the advice
string, we get a $\mathsf{BPP}^{\tilde{A}}$ machine that decides $L$ on all inputs with probability
$\Omega(2^{-n^k})$. We can then run the $\mathsf{BPP}^{\tilde{A}}$ machine sequentially on (say)
$n^{2k}$ inputs $x_1, \ldots, x_{n^{2k}}$, and decide all of them with a greater probability than
is allowed by the proof of Theorem 5.4.¹⁰ ∎

10 Because of the requirement that the $\mathsf{BPP}^{\tilde{A}}$ machine operates sequentially
(i.e., that it outputs the answer for each input $x_i$ before seeing the next input $x_{i+1}$),
there is no need here for a direct product theorem. On the other hand, proving direct product
theorems for algebraic query complexity is an interesting open problem.

5.3 Non-Algebrizing Techniques Needed for Circuit Lower Bounds


We now give an oracle $A$ and extension $\tilde{A}$ such that
$\mathsf{NEXP}^{\tilde{A}} \subset \mathsf{P}^{A}/\mathsf{poly}$. This implies that any proof of
$\mathsf{NEXP} \not\subset \mathsf{P/poly}$ will require non-algebrizing techniques.


Theorem 5.6 There exist $A, \tilde{A}$ such that $\mathsf{NTIME}^{\tilde{A}}(2^n) \subset \mathsf{SIZE}^{A}(n)$.


Proof. Let $M_1, M_2, \ldots$ be an enumeration of $\mathsf{NTIME}(2^n)$ oracle machines. Then on
inputs of size $n$, it suffices to simulate $M_1, \ldots, M_n$, since then every $M_i$ will be
simulated on all but finitely many input lengths.

For simplicity, we will assume that on inputs of size $n$, the $M_i$'s can query only a single
polynomial, $p : \mathbb{F}^{4n} \to \mathbb{F}$. Later we will generalize to the case where the
$M_i$'s can query $\tilde{A}_{n,\mathbb{F}}$ for every $n$ and $\mathbb{F}$ simultaneously.

We construct $p$ by an iterative process. We are dealing with $n 2^n$ pairs of the form
$\langle i, x \rangle$, where $x \in \{0,1\}^n$ is an input and $i \in [n]$ is the label of a
machine. At every iteration, each $\langle i, x \rangle$ will be either satisfied or unsatisfied,
and each point in $\mathbb{F}^{4n}$ will be either active or inactive. Initially all
$\langle i, x \rangle$'s are unsatisfied and all points are active.

To fix an active point $y$ will mean we fix the value of $p(y)$ to some constant $c_y$, and switch
$y$ from active to inactive. Once $y$ is inactive, it never again becomes active, and $p(y)$ never
again changes.

We say that $y$ is fixed consistently if, after it is fixed, there still exists a multiquadratic
extension polynomial $p : \mathbb{F}^{4n} \to \mathbb{F}$ such that $p(y) = c_y$ for all inactive
points $y$. Then the iterative process consists of repeatedly asking the following question:

Does there exist an unsatisfied $\langle i, x \rangle$, such that by consistently fixing at most
$2^n$ active points, we can force $M_i$ to accept on input $x$?

If the answer is yes, then we fix those points, switch $\langle i, x \rangle$ from unsatisfied to
satisfied, and repeat. We stop only when we can no longer find another $\langle i, x \rangle$ to
satisfy.

Let $D$ be the set of inactive points when this process halts. Then $|D| \le n 2^{2n}$. So by
Lemma 4.5, there exists a subset $G \subseteq \{0,1\}^{4n}$, with $|G| \ge 2^{4n} - n 2^{2n}$, such
that for any Boolean function $f : \{0,1\}^{4n} \to \{0,1\}$, there exists a multiquadratic
polynomial $p : \mathbb{F}^{4n} \to \mathbb{F}$ satisfying

(i) $p(y) = c_y$ for all $y \in D$,
(ii) $p(z) = f(z)$ for all $z \in G$, and
(iii) $p(z) \in \{0,1\}$ for all Boolean $z$.

To every machine-input pair $\langle i, x \rangle$, associate a unique string
$w_{i,x} \in \{0,1\}^{4n}$ in some arbitrary way. Then for all $\langle i, x \rangle$ we have

$$\Pr_{z' \in \{0,1\}^{4n}}\left[z' \oplus w_{i,x} \in G\right] \ge 1 - \frac{n 2^{2n}}{2^{4n}}.$$

So by the union bound, there exists a fixed string $z' \in \{0,1\}^{4n}$ such that
$z' \oplus w_{i,x} \in G$ for all $\langle i, x \rangle$. We will choose the Boolean function $f$ so
that for every $\langle i, x \rangle$ pair, $f(z' \oplus w_{i,x})$ encodes whether or not $M_i$
accepts on input $x$. Note that doing so cannot cause any additional $\langle i, x \rangle$ pairs
to accept, for if it could, then we would have already forced those pairs to accept during the
iterative process. Our linear-size circuit for simulating the $M_i$'s will now just hardwire the
string $z'$.
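As a sanity check on the union bound, here is a minimal sketch in exact rational arithmetic. It assumes the per-pair failure bound is $n 2^{2n} / 2^{4n}$ and that there are $n 2^n$ machine-input pairs, as in the argument above; the function names are hypothetical.

```python
from fractions import Fraction

def miss_probability(n):
    """Bound on Pr[z' XOR w_{i,x} falls outside G] for a single pair: n * 2^(2n) / 2^(4n)."""
    return Fraction(n * 2 ** (2 * n), 2 ** (4 * n))

def union_bound(n):
    """Union bound over all n * 2^n machine-input pairs; simplifies to n^2 / 2^n."""
    return n * 2 ** n * miss_probability(n)
```

Under these assumptions the total failure probability is $n^2 / 2^n$, which drops below 1 once $n \ge 5$, so a good string $z'$ exists for all sufficiently large $n$.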




Finally, let us generalize to the case where the $M_i$'s can query $\tilde{A}_{n,\mathbb{F}}$ for
any input length $n$ and finite field $\mathbb{F}$ of their choice. This requires only a small
change to the original proof. We construct $\tilde{A}$ in stages. At stage $n$, assume that
$\tilde{A}_{1,\mathbb{F}}, \ldots, \tilde{A}_{n-1,\mathbb{F}}$ have already been fixed for every
$\mathbb{F}$. Then our goal is to fix $\tilde{A}_{n,\mathbb{F}}$ for every $\mathbb{F}$. Let
$Y_{\mathbb{F}}$ be the set of points in $\mathbb{F}^{4n}$ for which the value of
$\tilde{A}_{n,\mathbb{F}}$ was fixed in one of the previous $n - 1$ stages. Then

$$\sum_{\mathbb{F}} |Y_{\mathbb{F}}| \le \sum_{m=1}^{n-1} m 2^{2m} \le n 2^{2n}.$$

So by Lemma 4.5, for all $\mathbb{F}$ we can find multiquadratic polynomials
$\tilde{A}_{n,\mathbb{F}} : \mathbb{F}^{4n} \to \mathbb{F}$ that satisfy all the forcing
conditions, and that also encode in some secret location whether $M_i$ accepts on input $x$ for all
$i \in [n]$ and $x \in \{0,1\}^n$. ∎

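By a standard padding argument, Theorem 5.6 immediately gives $A, \tilde{A}$ such that
$\mathsf{NEXP}^{\tilde{A}} \subset \mathsf{P}^{A}/\mathsf{poly}$. This collapse is almost the best
possible, since Theorem 3.17 implies that there do not exist $A, \tilde{A}$ such that
$\mathsf{MA}_{\mathsf{EXP}}^{\tilde{A}} \subset \mathsf{P}^{A}/\mathsf{poly}$.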


Wilson [43] gave an oracle $A$ relative to which
$\mathsf{EXP}^{\mathsf{NP}^{A}} \subset \mathsf{P}^{A}/\mathsf{poly}$. Using similar ideas, one
can straightforwardly generalize the construction of Theorem 5.6 to obtain the following:


Theorem 5.7 There exist $A, \tilde{A}$ such that $\mathsf{EXP}^{\mathsf{NP}^{\tilde{A}}} \subset \mathsf{P}^{A}/\mathsf{poly}$.


One can also combine the ideas of Theorem 5.6 with those of Theorem 5.4 to obtain the following:


Theorem 5.8 There exist $A, \tilde{A}$ such that $\mathsf{BPEXP}^{\tilde{A}} \subset \mathsf{P}^{A}/\mathsf{poly}$.


We omit the details of the above two constructions. However, we would like to mention one 
interesting implication of Theorem 5.8. Fortnow and Klivans [14] recently showed the following: 


Theorem 5.9 ([14]) If the class of polynomial-size circuits is exactly learnable by a BPP machine
from membership and equivalence queries, or is PAC-learnable by a BPP machine with respect to
the uniform distribution, then $\mathsf{BPEXP} \not\subset \mathsf{P/poly}$.


By combining Theorem 5.9 with Theorem 5.8, we immediately get the following corollary:


Corollary 5.10 There exist $A, \tilde{A}$ such that $\mathsf{P}^{A}/\mathsf{poly}$ circuits are not
exactly learnable from membership and equivalence queries (nor PAC-learnable with respect to the
uniform distribution), even if the learner is a BPP machine with oracle access to $\tilde{A}$.


Informally, Corollary 5.10 says that learning polynomial-size circuits would necessarily require 
non-algebrizing techniques. 


5.4 Non-Algebrizing Techniques Needed for Other Problems 


We can use the communication complexity transfer principle from Section 4.3 to achieve many 
other separations. 


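Theorem 5.11 There exist $A, \tilde{A}$ such that
(i) $\mathsf{NP}^{A} \not\subset \mathsf{BPP}^{\tilde{A}}$,
(ii) $\mathsf{coNP}^{A} \not\subset \mathsf{MA}^{\tilde{A}}$,
(iii) $\mathsf{P}^{\mathsf{NP}^{A}} \not\subset \mathsf{PP}^{\tilde{A}}$,
(iv) $\mathsf{NP}^{A} \not\subset \mathsf{BQP}^{\tilde{A}}$,
(v) $\mathsf{BQP}^{A} \not\subset \mathsf{BPP}^{\tilde{A}}$, and
(vi) $\mathsf{QMA}^{A} \not\subset \mathsf{MA}^{\tilde{A}}$.
Furthermore, for all of these separations $\tilde{A}$ is simply the multilinear extension of $A$.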


Proof Sketch. Let us first explain the general idea, before applying it to prove these separations.
Given a complexity class $\mathcal{C}$, let $\mathcal{C}_{cc}$ be the communication complexity
analogue of $\mathcal{C}$: that is, the class of communication predicates
$f : \{0,1\}^N \times \{0,1\}^N \to \{0,1\}$ that are decidable by a $\mathcal{C}$ machine using
$O(\mathrm{polylog}\, N)$ communication. Also suppose
$\mathcal{C}^{A} \subseteq \mathcal{D}^{\tilde{A}}$ for all oracles $A$ and multilinear extensions
$\tilde{A}$ of $A$. Then the transfer principle (Theorem 4.11) would imply that
$\mathcal{C}_{cc} \subseteq \mathcal{D}_{cc}$. Thus, if we know already that
$\mathcal{C}_{cc} \not\subset \mathcal{D}_{cc}$, we can use that to conclude that there exist
$A, \tilde{A}$ such that $\mathcal{C}^{A} \not\subset \mathcal{D}^{\tilde{A}}$.

We now apply this idea to prove the six separations listed above.


(i) Recall that Kalyanasundaram and Schnitger [22] (see also [33]) proved an $\Omega(N)$ lower
bound on the randomized communication complexity of the Disjointness predicate. From this,
together with a standard diagonalization argument, one easily gets that
$\mathsf{NP}_{cc} \not\subset \mathsf{BPP}_{cc}$. Hence there exist $A, \tilde{A}$ such that
$\mathsf{NP}^{A} \not\subset \mathsf{BPP}^{\tilde{A}}$.


(ii) Klauck [25] has generalized the lower bound of [33, 22] to show that Disjointness has MA
communication complexity $\Omega(\sqrt{N})$. From this it follows that
$\mathsf{coNP}_{cc} \not\subset \mathsf{MA}_{cc}$, and hence
$\mathsf{coNP}^{A} \not\subset \mathsf{MA}^{\tilde{A}}$.


(iii) Buhrman, Vereshchagin, and de Wolf [9] showed that
$\mathsf{P}^{\mathsf{NP}}_{cc} \not\subset \mathsf{PP}_{cc}$, which implies
$\mathsf{P}^{\mathsf{NP}^{A}} \not\subset \mathsf{PP}^{\tilde{A}}$.


(iv) Razborov [34] showed that Disjointness has quantum communication complexity
$\Omega(\sqrt{N})$. This implies that $\mathsf{NP}_{cc} \not\subset \mathsf{BQP}_{cc}$, and hence
$\mathsf{NP}^{A} \not\subset \mathsf{BQP}^{\tilde{A}}$.¹¹


(v) Raz [30] gave an exponential separation between randomized and quantum communication
complexities for a promise problem. This implies that
$\mathsf{PromiseBQP}_{cc} \not\subset \mathsf{PromiseBPP}_{cc}$, and hence
$\mathsf{BQP}^{A} \not\subset \mathsf{BPP}^{\tilde{A}}$ (note that we can remove the promise by
simply choosing oracles $A, \tilde{A}$ that satisfy it).


(vi) Raz and Shpilka [31] showed that
$\mathsf{PromiseQMA}_{cc} \not\subset \mathsf{PromiseMA}_{cc}$. As in (iv), this implies that
$\mathsf{QMA}^{A} \not\subset \mathsf{MA}^{\tilde{A}}$.


We end by mentioning, without details, two other algebraic oracle separations that can be 
proved using the connection to communication complexity. 


11 Let us remark that, to our knowledge, this reduction constitutes the first use of quantum
communication complexity to obtain a new lower bound on quantum query complexity. The general
technique might be applicable to other problems in quantum lower bounds.

First, Andy Drucker (personal communication) has found $A, \tilde{A}$ such that
$\mathsf{NP}^{A} \not\subset \mathsf{PCP}^{\tilde{A}}$, thus giving a sense in which “the PCP
Theorem is non-algebrizing.” Here PCP is defined similarly to MA, except that the verifier is only
allowed to examine $O(1)$ bits of the witness. Drucker proves this result by lower-bounding the
“PCP communication complexity” of the Non-Disjointness predicate. In particular, if Alice and Bob
are given a PCP of $m$ bits, of which they can examine at most $c$, then verifying
Non-Disjointness requires at least $N / m^{O(c)}$ bits of communication. It remains open what
happens when $m$ is large compared to $N$.

Second, Hartmut Klauck (personal communication) has found $A, \tilde{A}$ such that
$\mathsf{coNP}^{A} \not\subset \mathsf{QMA}^{\tilde{A}}$, by proving an $\Omega(N^{1/3})$ lower
bound on the QMA communication complexity of the Disjointness predicate.¹²


6 The Integers Case 


For simplicity, thus far in the paper we restricted ourselves to low-degree extensions over fields 
(typically, finite fields). We now consider the case of low-degree extensions over the integers. 
When we do this, one complication is that we can no longer use Gaussian elimination to construct 
“adversary polynomials” with desired properties. A second complication is that we now need to 
worry about the size of an extension oracle’s inputs and outputs (i.e., the number of bits needed to 
specify them). For both of these reasons, proving algebraic oracle separations is sometimes much 
harder in the integers case than in the finite field case. 
Formally, given a vector of integers $v = (v_1, \ldots, v_n)$, we define the size of $v$,

$$\mathrm{size}(v) := \sum_{i=1}^{n} \left\lceil \log_2(|v_i| + 2) \right\rceil,$$

to be a rough measure of the number of bits needed to specify $v$. Notice that
$\mathrm{size}(v) \ge n$ for all $v$.
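The size measure is easy to compute exactly. The sketch below is an illustration only (the function name `size` mirrors the text, but the implementation choice is ours): it uses the identity that for an integer $m \ge 2$, $\lceil \log_2 m \rceil$ equals the bit length of $m - 1$, which avoids floating-point logarithms.

```python
def size(v):
    """size(v) = sum over i of ceil(log2(|v_i| + 2)), computed with integer arithmetic."""
    # With m = |v_i| + 2 >= 2, ceil(log2(m)) == (m - 1).bit_length() == (|v_i| + 1).bit_length().
    return sum((abs(x) + 1).bit_length() for x in v)
```

Every coordinate contributes at least 1, which is why size(v) is never smaller than the number of coordinates.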
We can now give the counterpart of Definition 2.2 for integer extensions: 


Definition 6.1 (Extension Oracle Over The Integers) Let $A_m : \{0,1\}^m \to \{0,1\}$ be a Boolean
function. Then an extension of $A_m$ over the integers $\mathbb{Z}$ is a polynomial
$\hat{A}_m : \mathbb{Z}^m \to \mathbb{Z}$ such that $\hat{A}_m(x) = A_m(x)$ whenever
$x \in \{0,1\}^m$. Also, given an oracle $A = (A_m)$, an extension $\hat{A}$ of $A$ is a collection
of polynomials $\hat{A}_m : \mathbb{Z}^m \to \mathbb{Z}$, one for each $m \in \mathbb{N}$, such that

(i) $\hat{A}_m$ is an extension of $A_m$ for all $m$,
(ii) there exists a constant $c$ such that $\mathrm{mdeg}(\hat{A}_m) \le c$ for all $m$, and
(iii) there exists a polynomial $p$ such that $\mathrm{size}(\hat{A}_m(x)) \le p(m + \mathrm{size}(x))$ for all $x \in \mathbb{Z}^m$.

Then given a complexity class $\mathcal{C}$, by $\mathcal{C}^{\hat{A}}$ or
$\mathcal{C}^{\hat{A}[\mathrm{poly}]}$ we mean the class of languages decidable by a $\mathcal{C}$
machine that, on inputs of length $n$, can query $\hat{A}_m$ for any $m$ or any
$m = O(\mathrm{poly}(n))$ respectively.

12 It is an interesting question whether his lower bound is tight. We know that Disjointness
admits a quantum protocol with $O(\sqrt{N})$ communication [7, 2], as well as an MA-protocol with
$O(\sqrt{N} \log N)$ communication (see Section 7.2). The question is whether these can be
combined somehow to get a QMA-protocol with, say, $O(N^{1/3})$ communication.

Notice that integer extensions can always be used to simulate finite field extensions, since
given an integer Âm (x), together with a field F of order q! where q is prime, an algorithm can just 
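compute $\tilde{A}_{m,\mathbb{F}}(x) := \hat{A}_m(x) \bmod q$ for itself. In other words, for every
integer extension $\hat{A}$, there exists a finite field extension $\tilde{A}$ such that
$\mathcal{D}^{\tilde{A}} \subseteq \mathcal{D}^{\hat{A}}$ for all complexity classes $\mathcal{D}$
capable of modular arithmetic. Hence any result of the form
$\mathcal{C}^{A} \subset \mathcal{D}^{\tilde{A}}$ for all $A, \tilde{A}$ automatically implies
$\mathcal{C}^{A} \subset \mathcal{D}^{\hat{A}}$ for all $A, \hat{A}$. Likewise, any construction of
oracles $A, \hat{A}$ such that $\mathcal{C}^{A} \not\subset \mathcal{D}^{\hat{A}}$ automatically
implies the existence of $A, \tilde{A}$ such that
$\mathcal{C}^{A} \not\subset \mathcal{D}^{\tilde{A}}$.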
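The reduction mod $q$ can be checked on a small case. The following is a minimal sketch under stated assumptions: it uses the multilinear extension specifically, handles only a prime field F_q, and the helper names `ext_over_Z`, `ext_over_Fq`, and `simulate_field_query` are hypothetical.

```python
from itertools import product

def ext_over_Z(A, x):
    """Multilinear extension of A evaluated at an integer point x, over the integers."""
    total = 0
    for z, a in A.items():
        term = a
        for zi, xi in zip(z, x):
            term *= xi if zi else 1 - xi
        total += term
    return total

def ext_over_Fq(A, x, q):
    """Multilinear extension of A evaluated at x, with all arithmetic done mod the prime q."""
    total = 0
    for z, a in A.items():
        term = a % q
        for zi, xi in zip(z, x):
            term = term * ((xi if zi else 1 - xi) % q) % q
        total = (total + term) % q
    return total

def simulate_field_query(A, x, q):
    """Answer a query to the F_q extension using a single query to the integer extension."""
    return ext_over_Z(A, x) % q

# Hypothetical 3-bit oracle used for the check.
A = {z: z[0] | (z[1] & z[2]) for z in product((0, 1), repeat=3)}
```

Because reduction mod $q$ commutes with addition and multiplication, the two evaluations agree at every integer point.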

We now define the model of algebraic query complexity over the integers. 


Definition 6.2 (Algebraic Query Complexity Over Z) Let $f : \{0,1\}^{N} \to \{0,1\}$ be a Boolean
function (with $N = 2^n$), and let $s$ and $c$ be positive integers. Also, let $\mathcal{M}$ be the
set of deterministic algorithms $M$ such that for every oracle $A : \{0,1\}^n \to \{0,1\}$, and
every integer extension $\hat{A} : \mathbb{Z}^n \to \mathbb{Z}$ of $A$ with
$\mathrm{mdeg}(\hat{A}) \le c$,

(i) $M^{\hat{A}}$ outputs $f(A)$, and
(ii) every query $x$ made by $M^{\hat{A}}$ satisfies $\mathrm{size}(x) \le s$.

Then the deterministic algebraic query complexity of $f$ over $\mathbb{Z}$ is defined as

$$\hat{D}_{s,c}(f) := \min_{M \in \mathcal{M}} \; \max_{A, \hat{A} \,:\, \mathrm{mdeg}(\hat{A}) \le c} T_M(\hat{A}),$$

where $T_M(\hat{A})$ is the number of queries to $\hat{A}$ made by $M^{\hat{A}}$. (For the purposes
of this definition, we do not impose any upper bound on $\mathrm{size}(\hat{A}(x))$.) The
randomized and quantum algebraic query complexities $\hat{R}_{s,c}(f)$ and $\hat{Q}_{s,c}(f)$ are
defined similarly, except with (bounded-error) randomized and quantum algorithms in place of
deterministic ones.


Notice that proving lower bounds on $\hat{D}_{s,c}$, $\hat{R}_{s,c}$, and $\hat{Q}_{s,c}$ becomes
harder as $s$ increases, and easier as $c$ increases.

Our goal is twofold: (1) to prove lower bounds on the above-defined query complexity measures,
and (2) to use those lower bounds to prove algebraic oracle separations over the integers (for
example, that there exist $A, \hat{A}$ such that
$\mathsf{NP}^{A} \not\subset \mathsf{P}^{\hat{A}}$).


6.1 Lower Bounds by Communication Complexity 


A first happy observation is that every lower bound or oracle separation proved using Theorem 4.11 
(the communication complexity transfer principle) automatically carries over to the integers case. 
This is so because of the following direct analogue of Theorem 4.11 for integer extensions: 


Theorem 6.3 Let $A : \{0,1\}^n \to \{0,1\}$ be a Boolean function, and let
$\hat{A} : \mathbb{Z}^n \to \mathbb{Z}$ be the unique multilinear extension of $A$ over
$\mathbb{Z}$. Suppose one can evaluate some Boolean predicate $f$ of $A$ using $T$ deterministic
adaptive queries to $\hat{A}$, where each query $x \in \mathbb{Z}^n$ satisfies
$\mathrm{size}(x) \le s$. Also, let $A_0$ and $A_1$ be the subfunctions of $A$ obtained by
restricting the first bit to 0 or 1 respectively. Then if Alice is given the truth table of $A_0$
and Bob is given the truth table of $A_1$, they can jointly evaluate $f(A)$ using $O(Ts)$ bits of
communication.


The proof of Theorem 6.3 is essentially the same as the proof of Theorem 4.11, and is therefore 
omitted. 

By analogy to Theorem 4.12, Theorem 6.3 has the following immediate consequence for the 
randomized query complexity of Disjointness over the integers: 
Theorem 6.4 $\hat{R}_{s,1}(\mathrm{DISJ}) = \Omega(2^n / s)$ for all $s$.


Proof. Suppose $\hat{R}_{s,1}(\mathrm{DISJ}) = o(2^n / s)$. Then by Theorem 6.3, we get a
randomized protocol for Disjointness with communication cost $o(2^n)$, thereby violating the lower
bound of Razborov [33] and Kalyanasundaram and Schnitger [22]. ∎

One can also use Theorem 6.3 to construct oracles $A$ and integer extensions $\hat{A}$ such that


• $\mathsf{NP}^{A} \not\subset \mathsf{P}^{\hat{A}}$,
• $\mathsf{RP}^{A} \not\subset \mathsf{P}^{\hat{A}}$,
• $\mathsf{NP}^{A} \not\subset \mathsf{BQP}^{\hat{A}}$,


and so on for all the other oracle separations obtained in Section 5.4 in the finite field case.
The proofs are similar to those in Section 5.4 and are therefore omitted.


6.2 Lower Bounds by Direct Construction 


Unlike with the communication complexity arguments, when we try to port the direct construction 
arguments of Section 4.2 to the integers case we encounter serious new difficulties. The basic 
source of the difficulties is that the integers are not a field but a ring, and thus we can no longer 
construct multilinear polynomials by simply solving linear equations. 

In this section, we partly overcome this problem by using some tools from elementary number
theory, such as Chinese remaindering and Hensel lifting. The end result will be an exponential
lower bound on $\hat{D}_{s,2}(\mathrm{OR})$: the number of queries to a multiquadratic integer
extension $\hat{A} : \mathbb{Z}^n \to \mathbb{Z}$ needed to decide whether there exists an
$x \in \{0,1\}^n$ with $A(x) = 1$, assuming the queries are deterministic and have size at most
$s < 2^n$.

Unfortunately, even after we achieve this result, we will still not be able to use it to prove
oracle separations like $\mathsf{NP}^{A} \not\subset \mathsf{P}^{\hat{A}}$. The reason is
technical, and has to do with $\mathrm{size}(\hat{A}(x))$: the number of bits needed to specify an
output of $\hat{A}$. In our adversary construction, $\mathrm{size}(\hat{A}(x))$ will grow like
$O(\mathrm{size}(x) + ts)$, where $t$ is the number of queries made by the algorithm we are
fighting against and $s$ is the maximum size of those queries. The dependence on
$\mathrm{size}(x)$ is fine, but the dependence on $t$ and $s$ is a problem for two reasons. First,
the number of bits needed to store $\hat{A}$'s output might exceed the running time of the
algorithm that calls $\hat{A}$! Second, we ultimately want to diagonalize against all
polynomial-time Turing machines, and this will imply that $\mathrm{size}(\hat{A}(x))$ must grow
faster than polynomial.

Nevertheless, both because we hope it will lead to stronger results, and because the proof is
mathematically interesting, we now present a lower bound on $\hat{D}_{s,2}(\mathrm{OR})$.

Our goal is to arrive at a lemma similar to Lemma 4.3 in the field case; its analogue will be 
Lemma 6.9 below. 


Lemma 6.5 Let $y_1, \ldots, y_t$ be points in $\mathbb{Z}^n$ and let $q$ be a prime. Then there
exists a multilinear polynomial $h_q : \mathbb{Z}^n \to \mathbb{Z}$ such that

(i) $h_q(y_i) \equiv 0 \pmod{q}$ for all $i \in [t]$, and
(ii) $h_q(z) = 1$ for at least $2^n - t$ Boolean points $z$.

(Note that $h_q$ could be non-Boolean on the remaining Boolean points.)

Proof. Let $N = 2^n$; then we can label the $N$ Boolean points $z_1, \ldots, z_N$. For all
$i \in [N]$, let $\delta_i$ be the unique multilinear polynomial satisfying $\delta_i(z_i) = 1$ and
$\delta_i(z_j) = 0$ for all $j \neq i$.

Now let $A$ be a $(t + N) \times N$ integer matrix whose top $t$ rows are labeled by
$y_1, \ldots, y_t$, whose bottom $N$ rows are labeled by $z_1, \ldots, z_N$, and whose columns are
labeled by $\delta_1, \ldots, \delta_N$. The $(x, \delta_i)$ entry is equal to $\delta_i(x)$. We
assume without loss of generality that the top $t \times N$ submatrix of $A$ has full rank mod $q$,
for if it does not, then we simply remove rows until it does. Notice that the bottom $N \times N$
submatrix of $A$ is just the identity matrix $I$.

Now remove $t$ of the bottom $N$ rows, in such a way that the resulting $N \times N$ submatrix $B$
of $A$ is nonsingular mod $q$. Then for every vector $v \in \mathbb{F}_q^N$, the system
$B\alpha \equiv v \pmod{q}$ is solvable for $\alpha \in \mathbb{F}_q^N$. So choose $v$ to contain
0's in the first $t$ coordinates and 1's in the remaining $N - t$ coordinates; then solve to obtain
a vector $\alpha = (\alpha_1, \ldots, \alpha_N)$. Finally, reinterpret the $\alpha_i$'s as integers
from $0$ to $q - 1$ rather than elements of $\mathbb{F}_q$, and set the polynomial $h_q$ to be

$$h_q(x) := \sum_{i=1}^{N} \alpha_i \delta_i(x).$$

It is clear that $h_q$ so defined satisfies property (i). To see that it satisfies (ii), notice
that the last $N - t$ rows of $B$ are unit vectors. Hence, even over $\mathbb{F}_q$, any solution to
the system $B\alpha \equiv v \pmod{q}$ must set $\alpha_{t+1} = \cdots = \alpha_N = 1$. ∎

We wish to generalize Lemma 6.5 to the case where the modulus q is not necessarily prime. To 
do so, we will need two standard number theory facts, which we prove for completeness. 


Proposition 6.6 (Hensel Lifting) Let $B$ be an $N \times N$ integer matrix, and suppose $B$ is
invertible mod $q$ for some prime $q$. Then the system $B\alpha \equiv v \pmod{q^e}$ has a solution
in $\alpha \in \mathbb{Z}^N$ for every $v \in \mathbb{Z}^N$ and $e \in \mathbb{N}$.


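Proof. By induction on $e$. When $e = 1$ the proposition obviously holds, so assume it holds for
$e$. Then there exists a solution $\alpha$ to $B\alpha \equiv v \pmod{q^e}$, meaning that
$B\alpha - v = q^e c$ for some $c \in \mathbb{Z}^N$. From this we want to construct a solution
$\alpha'$ to $B\alpha' \equiv v \pmod{q^{e+1}}$. Our solution will have the form
$\alpha' = \alpha + q^e \beta$ for some $\beta \in \mathbb{Z}^N$. To find $\beta$, notice that

$$\begin{aligned}
B\alpha' &= B(\alpha + q^e \beta) \\
&= B\alpha + q^e B\beta \\
&= v + q^e c + q^e B\beta \\
&= v + q^e (c + B\beta).
\end{aligned}$$

Thus, it suffices to find a $\beta$ such that $B\beta \equiv -c \pmod{q}$. Since $B$ is invertible
mod $q$, such a $\beta$ exists. ∎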
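The induction in this proof is effectively an algorithm, and can be sketched directly. The code below is an illustration under assumptions of our own: the helper names `solve_mod_prime`, `mat_vec`, and `hensel_lift` are hypothetical, the base case is solved by plain Gauss-Jordan elimination mod $q$, and modular inverses use Python's three-argument `pow` with exponent -1 (available from Python 3.8).

```python
def solve_mod_prime(B, v, q):
    """Solve B x = v (mod q) for prime q by Gauss-Jordan elimination; B must be invertible mod q."""
    n = len(B)
    M = [[B[i][j] % q for j in range(n)] + [v[i] % q] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, q)
        M[col] = [x * inv % q for x in M[col]]
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(a - f * b) % q for a, b in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def mat_vec(B, x):
    """Integer matrix-vector product B x."""
    return [sum(b * xi for b, xi in zip(row, x)) for row in B]

def hensel_lift(B, v, q, e):
    """Return an integer vector alpha with B alpha = v (mod q^e), following the induction above."""
    alpha = solve_mod_prime(B, v, q)                      # base case e = 1
    for k in range(1, e):
        qk = q ** k
        # By the inductive hypothesis, B alpha - v = q^k * c for an integer vector c.
        c = [(s - vi) // qk for s, vi in zip(mat_vec(B, alpha), v)]
        beta = solve_mod_prime(B, [-ci for ci in c], q)   # B beta = -c (mod q)
        alpha = [a + qk * b for a, b in zip(alpha, beta)]
    return alpha
```

Each pass through the loop raises the modulus from $q^k$ to $q^{k+1}$ at the cost of one linear solve mod $q$, exactly as in the inductive step.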


Proposition 6.7 (Chinese Remaindering) Let $K$ and $L$ be relatively prime. Then there exist
integers $a, b \in [KL]$ such that:

(i) For all $x, y, z$, the congruence $z \equiv ax + by \pmod{KL}$ holds if and only if
$z \equiv x \pmod{K}$ and $z \equiv y \pmod{L}$.

(ii) If $x = y = 1$, then $ax + by = KL + 1$ as an integer.

Proof. Let $K'$ and $L'$ be integers in $[KL]$ such that $K' \equiv K^{-1} \pmod{L}$ and
$L' \equiv L^{-1} \pmod{K}$; note that these exist since $K$ and $L$ are relatively prime. Then we
simply need to set $a := LL'$ and $b := KK'$. ∎
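A small numerical check of this construction follows. It is a sketch under assumptions of our own: the function name `crt_coefficients` is hypothetical, $K, L \ge 2$ are assumed, and the inverses are taken as their least positive representatives (via Python 3.8+ `pow(x, -1, m)`), which is the choice that makes property (ii) hold as an integer identity.

```python
def crt_coefficients(K, L):
    """Return (a, b) with a = L * L' and b = K * K', where L' = L^{-1} mod K and K' = K^{-1} mod L."""
    L_inv = pow(L, -1, K)   # least positive inverse; raises ValueError if gcd(K, L) != 1
    K_inv = pow(K, -1, L)
    return L * L_inv, K * K_inv
```

For example, with $K = 3$ and $L = 5$ this gives $a = 10$ and $b = 6$, so $a + b = 16 = KL + 1$. Since $a \equiv 1 \pmod{K}$, $a \equiv 0 \pmod{L}$, and symmetrically for $b$, the combination $ax + by$ reduces to $x$ mod $K$ and to $y$ mod $L$.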

We can now prove the promised generalization of Lemma 6.5. 


Lemma 6.8 Let $y_1, \ldots, y_t$ be points in $\mathbb{Z}^n$, let $Q$ be an integer, and let
$Q = q_1^{e_1} \cdots q_m^{e_m}$ be its prime factorization. Then there exists a multilinear
polynomial $h_Q : \mathbb{Z}^n \to \mathbb{Z}$ such that

(i) $h_Q(y_i) \equiv 0 \pmod{Q}$ for all $i \in [t]$, and
(ii) $h_Q(z) = 1$ for at least $2^n - mt$ Boolean points $z$.

Proof. Say that a multilinear polynomial $h : \mathbb{Z}^n \to \mathbb{Z}$ is $(K, r)$-satisfactory if

(i') $h(y_i) \equiv 0 \pmod{K}$ for all $i \in [t]$, and
(ii') $h(z) = 1$ for at least $2^n - r$ Boolean points $z$.


Recall that if $q$ is prime, then Lemma 6.5 yields a $(q, t)$-satisfactory polynomial $h_q$.
Furthermore, the coefficients $(\alpha_1, \ldots, \alpha_N)$ of $h_q$ were obtained by solving a
linear system $B\alpha \equiv v \pmod{q}$ where $B$ was invertible mod $q$.

First, suppose $K = q^e$ is a prime power. Then by Proposition 6.6, we can “lift” the solution
$\alpha \in \mathbb{Z}^N$ of $B\alpha \equiv v \pmod{q}$ to a solution $\alpha' \in \mathbb{Z}^N$ of
$B\alpha' \equiv v \pmod{K}$. Furthermore, after we perform this lifting, we still have
$\alpha'_{t+1} = \cdots = \alpha'_N = 1$, since the matrix $B$ has not changed (and in particular
contains the identity submatrix). So if we set

$$h_K(x) := \sum_{i=1}^{N} \alpha'_i \delta_i(x)$$

then $h_K$ is $(K, t)$-satisfactory.

Now let $K$ and $L$ be relatively prime, and suppose we found a $(K, r)$-satisfactory polynomial
$h_K$ as well as an $(L, r')$-satisfactory polynomial $h_L$. We want to combine $h_K$ and $h_L$
into a $(KL, r + r')$-satisfactory polynomial $h_{KL}$. To do so, we use Chinese remaindering (as
in Proposition 6.7) to find an affine linear combination

$$h_{KL}(x) := a\, h_K(x) + b\, h_L(x) - KL$$

such that

(i'') $h_{KL}(x) \equiv 0 \pmod{KL}$ if and only if $h_K(x) \equiv 0 \pmod{K}$ and $h_L(x) \equiv 0 \pmod{L}$, and
(ii'') if $h_K(x) = 1$ and $h_L(x) = 1$ then $h_{KL}(x) = 1$.

Since there are at least $2^n - (r + r')$ Boolean points $z$ such that $h_K(z) = h_L(z) = 1$, this
yields a $(KL, r + r')$-satisfactory polynomial as desired.

Thus, given any composite integer $Q = q_1^{e_1} \cdots q_m^{e_m}$, we can first use Hensel lifting
to find a $(q_i^{e_i}, t)$-satisfactory polynomial $h_{q_i^{e_i}}$ for every $i \in [m]$, and then
use Chinese remaindering to combine the $h_{q_i^{e_i}}$'s into a $(Q, mt)$-satisfactory polynomial
$h_Q$. ∎

We are finally ready to prove the integer analogue of Lemma 4.3. 
Lemma 6.9 Let $y_1, \ldots, y_t$ be points in $\mathbb{Z}^n$, such that $\mathrm{size}(y_i) \le s$
for all $i \in [t]$. Then for at least $2^n - 2t^2 s$ Boolean points $w \in \{0,1\}^n$, there exists
a multiquadratic polynomial $p : \mathbb{Z}^n \to \mathbb{Z}$ such that

(i) $p(y_i) = 0$ for all $i \in [t]$,
(ii) $p(w) = 1$, and
(iii) $p(z) = 0$ for all Boolean $z \neq w$.


Proof. Assume $t < 2^n$, since otherwise the lemma is trivial.

Let $h_Q : \mathbb{Z}^n \to \mathbb{Z}$ be the multilinear polynomial from Lemma 6.8, for some integer $Q$ to be
specified later. Then our first claim is that there exists a multilinear polynomial $g : \mathbb{Q}^n \to \mathbb{Q}$, with
rational coefficients, such that

(i') $g(y_i) = h_Q(y_i)$ for all $i \in [t]$, and

(ii') $g(z) = 0$ for at least $2^n - t$ Boolean points $z$.

This claim follows from linear algebra: we know the requirements $g(y_i) = h_Q(y_i)$ for $i \in [t]$ are
mutually consistent, since there exists a multilinear polynomial, namely $h_Q$, that satisfies them.
So if we write $g$ in the basis of $\delta_z$'s, as follows:

$$g(x) = \sum_{z \in \{0,1\}^n} g(z)\,\delta_z(x),$$

then condition (i') gives us $t'$ independent affine constraints on the $2^n$ coefficients $g(z)$, for some
$t' \le t$. This means there must exist a solution $g$ such that $g(z) = 0$ for at least $2^n - t'$ Boolean
points $z$. Let $z_1, \ldots, z_{t'}$ be the remaining $t'$ Boolean points.

Notice that $z_1, \ldots, z_{t'}$ can be chosen independently of $h_Q$. This is because we simply need
to find $t'$ Boolean points $z_1, \ldots, z_{t'}$ such that any "allowed" vector $(h_Q(y_1), \ldots, h_Q(y_t))$ can be
written as a rational linear combination of vectors of the form $(\delta_{z_j}(y_1), \ldots, \delta_{z_j}(y_t))$ with $j \in [t']$.

We now explain how $Q$ is chosen. Let $\Gamma$ be a $t \times t'$ matrix whose rows are labeled by $y_1, \ldots, y_t$,
whose columns are labeled by $z_1, \ldots, z_{t'}$, and whose $(i, j)$ entry equals $\delta_{z_j}(y_i)$. Then since we had
$t'$ independent affine constraints, there must be a $t' \times t'$ submatrix $\Gamma'$ of $\Gamma$ with full rank. We set
$Q := |\det(\Gamma')|$.

With this choice of $Q$, we claim that $g$ is actually an integer polynomial. It suffices to show
that $g(z_j)$ is an integer for all $j \in [t']$, since the value of $g$ at any $x \in \mathbb{Z}^n$ can be written as an integer
linear combination of its values on the Boolean points. Note that the vector $(g(z_1), \ldots, g(z_{t'}))$
is obtained by applying the matrix $(\Gamma')^{-1}$ to some vector $(v_1, \ldots, v_{t'})$ whose entries are $h_Q(y_i)$'s.
Now, every entry of $(\Gamma')^{-1}$ has the form $k/Q$, where $k$ is an integer; and since $h_Q(y_i) \equiv 0 \pmod{Q}$
for all $i \in [t]$, every $v_i$ is an integer multiple of $Q$. This completes the claim.




Also, since $\mathrm{size}(y_i) < s$ for all $i \in [t]$, we have the upper bound

$$Q = |\det(\Gamma')| \le t'! \prod_{i=1}^{t'} \prod_{k=1}^{n} \left(|y_{i,k}| + 1\right) \le t^t\, 2^{ts} = 2^{ts + t \log_2 t} \le 2^{2ts},$$

where $y_{i,k}$ denotes the $k$-th coordinate of $y_i$. Here the last inequality uses the assumption that $\log_2 t < n$,
together with the fact that $n < s$.
Therefore $Q$ can have at most $2ts$ distinct prime factors. So by Lemma 6.8, we have $h_Q(z) = 1$
for at least $2^n - 2t^2 s$ Boolean points $z$.
Putting everything together, if we define $m(x) := h_Q(x) - g(x)$, then we get a multilinear
polynomial $m : \mathbb{Z}^n \to \mathbb{Z}$ such that

(i'') $m(y_i) = 0$ for all $i \in [t]$, and
(ii'') $m(z) = 1$ for at least $2^n - 2t^2 s$ Boolean points $z$.

Then for any $w \in \{0,1\}^n$ with $m(w) = 1$, we can get a multiquadratic polynomial $p : \mathbb{Z}^n \to \mathbb{Z}$
satisfying conditions (i)-(iii) of the lemma by taking $p(x) := m(x)\,\delta_w(x)$. ∎
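
The last step of the proof, multiplying a multilinear $m$ by the indicator polynomial $\delta_w$, can be illustrated with a small Python sketch. It assumes the standard basis polynomial $\delta_z(x) = \prod_i [z_i x_i + (1 - z_i)(1 - x_i)]$; the function names and the toy polynomial $m(x) = 1 - x_1 x_2$ are illustrative choices, not objects from the paper.

```python
def delta(z, x):
    """delta_z(x) = prod_i [z_i * x_i + (1 - z_i) * (1 - x_i)] over the integers."""
    out = 1
    for zi, xi in zip(z, x):
        out *= xi if zi else 1 - xi
    return out

def multilinear(values, x):
    """Evaluate the multilinear polynomial with the given Boolean values at integer x.

    `values` maps each Boolean tuple z to the integer m(z), so the result is an
    integer linear combination of the values on Boolean points.
    """
    return sum(v * delta(z, x) for z, v in values.items())

def multiquadratic_p(values, w, x):
    """p(x) := m(x) * delta_w(x), which has degree at most 2 in each variable."""
    return multilinear(values, x) * delta(w, x)

# Toy instance with n = 2: m(x) = 1 - x1 * x2, which vanishes at the
# non-Boolean query point y = (-1, -1).
m_values = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```

Here `multiquadratic_p(m_values, (0, 0), x)` vanishes at the query point $(-1, -1)$ and at every Boolean point other than $w = (0, 0)$, while taking the value 1 at $w$.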

Lemma 6.9 easily implies a lower bound on the deterministic query complexity of the OR 
function. 


Theorem 6.10 $\widetilde{D}_{s,2}(\mathrm{OR}) = \Omega\left(\sqrt{2^n/s}\right)$ for all $s$.


Proof. Let $Y$ be the set of points queried by a deterministic algorithm, and assume $\mathrm{size}(y) < s$ for
all $y \in Y$. Then provided $2^n - 2|Y|^2 s > 0$ (or equivalently $|Y| < \sqrt{2^{n-1}/s}$), Lemma 6.9 implies
that there exists a multiquadratic extension polynomial $\tilde{A} : \mathbb{Z}^n \to \mathbb{Z}$ such that $\tilde{A}(y) = 0$ for all
$y \in Y$, but nevertheless $\tilde{A}(w) = 1$ for some Boolean point $w$. So even if the algorithm is adaptive,
we can let $Y$ be the set of points it queries assuming each query is answered with 0, and then find
$\tilde{A}, \tilde{B}$ such that $\tilde{A}(y) = \tilde{B}(y) = 0$ for all $y \in Y$, but nevertheless $\tilde{A}$ and $\tilde{B}$ lead to different values
of the OR function. ∎
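
The threshold on the number of queries used in the proof can be unpacked as follows. The numerical instance ($n = 20$, $s = 32$) is an illustrative assumption, not a value from the text.

```latex
\[
2^n - 2|Y|^2 s > 0
\iff |Y|^2 < \frac{2^n}{2s} = \frac{2^{n-1}}{s}
\iff |Y| < \sqrt{2^{n-1}/s}.
\]
% Illustrative instance: n = 20, s = 32.
\[
\sqrt{2^{19}/2^{5}} = \sqrt{2^{14}} = 128,
\qquad
2^{20} - 2 \cdot 127^2 \cdot 32 = 16320 > 0 .
\]
```

So in this instance the argument applies to any algorithm making at most 127 queries of size less than 32.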

As mentioned before, one can calculate that the polynomial $p$ from Lemma 6.9 satisfies
$\mathrm{size}(p(x)) = O(\mathrm{size}(x) + ts)$. For algebrization purposes, the key question is whether the dependence on $t$ and
$s$ can be eliminated, and replaced by some fixed polynomial dependence on $\mathrm{size}(x)$ and $n$. Another
interesting question is whether one can generalize Lemma 6.9 to queries of unbounded size, that
is, whether the assumption $\mathrm{size}(y_i) < s$ can simply be eliminated.


7 Applications to Communication Complexity 


In this section, we give two applications of our algebrization framework to communication com- 
plexity: 

(1) A new connection between communication complexity and computational complexity, which
shows that certain plausible communication complexity conjectures would imply $\mathsf{NL} \ne \mathsf{NP}$.


(2) MA-protocols for the Disjointness and Inner Product problems with total communication cost
$O(\sqrt{n} \log n)$, essentially matching a lower bound of Klauck [25].


Both of these results can be stated without any reference to algebrization. On the other hand, 
they arose directly from the “transfer principle” relating algebrization to communication complexity 
in Section 4.3. 


7.1 Karchmer-Wigderson Revisited 


Two decades ago, Karchmer and Wigderson [24, 42] noticed that certain communication complexity 
lower bounds imply circuit lower bounds—or in other words, that one can try to separate complexity 
classes by thinking only about communication complexity. In this section we use algebrization to 
give further results in the same spirit. 

Let $f : \{0,1\}^N \times \{0,1\}^N \to \{0,1\}$ be a Boolean function, and let $x$ and $y$ be inputs to $f$ held
by Alice and Bob respectively. By an IP-protocol for $f$, we mean a randomized communication
protocol where Alice and Bob exchange messages with each other, as well as with an omniscient
prover Merlin who knows $x$ and $y$. The communication cost is defined as the total number of bits
exchanged among Alice, Bob, and Merlin. If $f(x, y) = 1$, then there should exist a strategy of
Merlin that causes Alice and Bob to accept with probability at least 2/3, while if $f(x, y) = 0$ no
strategy should cause them to accept with probability more than 1/3.


Lemma 7.1 Suppose $f : \{0,1\}^N \times \{0,1\}^N \to \{0,1\}$ is in NL. Then $f$ has an IP-protocol with
communication cost $O(\mathrm{polylog}\, N)$.


Proof. Let $N = 2^n$. Then we can define a Boolean function $A : \{0,1\}^{n+1} \to \{0,1\}$, such that
the truth table of $A(0x)$ corresponds to Alice's input, while the truth table of $A(1x)$ corresponds
to Bob's input. Taking $n$ as the input length, we then have $f \in \mathsf{PSPACE}^{A[\mathrm{poly}]}$. By Theorem 3.7
we have $\mathsf{PSPACE}^{A[\mathrm{poly}]} \subseteq \mathsf{IP}^{\tilde{A}}$, where $\tilde{A}$ is the multilinear extension of $A$. Hence $f \in \mathsf{IP}^{\tilde{A}}$. But
by Theorem 4.11, this means that $f$ admits an IP-protocol with communication cost $O(\mathrm{poly}\, n) =
O(\mathrm{polylog}\, N)$. ∎
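
The encoding of the two inputs as a single oracle $A$ on $n + 1$ bits can be sketched in Python as follows. This is only an illustration of the bookkeeping in the proof: the function name `make_oracle` and the convention that the remaining $n$ bits index the truth table in big-endian order are assumptions.

```python
def make_oracle(alice_bits, bob_bits):
    """Pack two N-bit inputs (N = 2^n) into one Boolean function A on n + 1 bits.

    A(0, x) reads Alice's input at index x and A(1, x) reads Bob's input at
    index x, where x is an n-bit index interpreted in big-endian order.
    """
    N = len(alice_bits)
    n = N.bit_length() - 1
    assert N == 1 << n and len(bob_bits) == N

    def A(bits):
        assert len(bits) == n + 1
        owner, index_bits = bits[0], bits[1:]
        index = 0
        for bit in index_bits:
            index = 2 * index + bit
        return (bob_bits if owner else alice_bits)[index]

    return A, n
```

For example, with `alice_bits = [1, 0, 0, 1]` and `bob_bits = [0, 1, 1, 1]`, the query `A((0, 1, 1))` returns Alice's bit at index 3 and `A((1, 0, 0))` returns Bob's bit at index 0.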

An immediate consequence of Lemma 7.1 is that, to prove a problem is outside NL, it suffices 
to lower-bound its IP communication complexity: 


Theorem 7.2 Let Alice and Bob hold 3SAT instances $\varphi_A, \varphi_B$ respectively of size $N$. Suppose
there is no IP-protocol with communication cost $O(\mathrm{polylog}\, N)$, by which Merlin can convince Alice
and Bob that $\varphi_A$ and $\varphi_B$ have a common satisfying assignment. Then $\mathsf{NL} \ne \mathsf{NP}$.


Likewise, to prove a problem is outside P, it suffices to lower-bound its RG communication 
complexity, where RG is the Refereed Games model of Feige and Kilian [12] (with a competing 
yes-prover and no-prover). In this case, though, the EXP = RG theorem is not only algebrizing 
but also relativizing, and this lets us prove a stronger result: 


Theorem 7.3 Let $\varphi$ be a 3SAT instance of size $N$. Suppose there is no bounded-error randomized
verifier that decides whether $\varphi$ is satisfiable by


(i) making $O(\mathrm{polylog}\, N)$ queries to a binary encoding of $\varphi$, and

(ii) exchanging $O(\mathrm{polylog}\, N)$ bits with a competing yes-prover and no-prover, both of whom know
$\varphi$ and can exchange private messages not seen by the other prover.


Then $\mathsf{P} \ne \mathsf{NP}$.


Proof. Suppose $\mathsf{P} = \mathsf{NP}$. Then by padding, $\mathsf{EXP}^{A[\mathrm{poly}]} = \mathsf{NEXP}^{A[\mathrm{poly}]}$ for all oracles $A$. As
discussed in Section 3.5, the work of Feige and Kilian [12] implies that $\mathsf{EXP}^{A[\mathrm{poly}]} = \mathsf{RG}^A$ for
all oracles $A$. Hence $\mathsf{NEXP}^{A[\mathrm{poly}]} = \mathsf{RG}^A$ as well. In other words, given oracle access to an
exponentially large 3SAT instance $\varphi$, one can decide in RG whether $\varphi$ is satisfiable. Scaling down
by an exponential now yields the desired result. ∎


7.2  Disjointness and Inner Product 


In this subsection we consider two communication problems. The first is Disjointness, which was 
defined in Section 4.3. The second is Inner Product, which we define as follows. Alice and Bob 
are given $n$-bit strings $x_1 \ldots x_n$ and $y_1 \ldots y_n$ respectively; then their goal is to compute

$$\mathrm{IP}(x, y) := \sum_{i=1}^{n} x_i y_i$$

as an integer. Clearly Disjointness is equivalent to deciding whether $\mathrm{IP}(x, y) = 0$, and hence is
reducible to Inner Product.

Klauck [25] showed that any MA-protocol for Disjointness has communication cost $\Omega(\sqrt{n})$. The
"natural" conjecture would be that the $\sqrt{n}$ was merely an artifact of his proof, and that a more
refined argument would yield the optimal lower bound of $\Omega(n)$. However, using a protocol inspired
by our algebrization framework, we are able to show that this conjecture is false.


Theorem 7.4 There exist MA-protocols for the Disjointness and Inner Product problems, in which
Alice receives an $O(\sqrt{n} \log n)$-bit witness from Merlin and an $O(\sqrt{n} \log n)$-bit message from Bob.


Proof. As observed before, it suffices to give a protocol for Inner Product; a protocol for
Disjointness then follows immediately.
Assume $n$ is a perfect square. Then Alice and Bob can be thought of as holding functions
$a : [\sqrt{n}] \times [\sqrt{n}] \to \{0,1\}$ and $b : [\sqrt{n}] \times [\sqrt{n}] \to \{0,1\}$ respectively. Their goal is to compute the
inner product

$$\mathrm{IP} := \sum_{x, y \in [\sqrt{n}]} a(x, y)\, b(x, y).$$

Choose a prime $q \in [n, 2n]$. Then $a$ and $b$ have unique extensions $\tilde{a} : \mathbb{F}_q^2 \to \mathbb{F}_q$ and $\tilde{b} : \mathbb{F}_q^2 \to \mathbb{F}_q$
respectively as degree-$(\sqrt{n} - 1)$ polynomials. Also, define the polynomial $s : \mathbb{F}_q \to \mathbb{F}_q$ by

$$s(x) := \sum_{y=1}^{\sqrt{n}} \tilde{a}(x, y)\, \tilde{b}(x, y) \pmod{q}.$$

Notice that $\deg(s) \le 2(\sqrt{n} - 1)$.

Merlin's message to Alice consists of a polynomial $s' : \mathbb{F}_q \to \mathbb{F}_q$, which also has degree at most
$2(\sqrt{n} - 1)$, and which is specified by its coefficients. Merlin claims that $s = s'$. If Merlin is honest,
then Alice can easily compute the inner product as

$$\mathrm{IP} = \sum_{x=1}^{\sqrt{n}} s(x).$$

So the problem reduces to checking that $s = s'$. This is done as follows: first Bob chooses $r \in \mathbb{F}_q$
uniformly at random and sends it to Alice, along with the value of $\tilde{b}(r, y)$ for every $y \in [\sqrt{n}]$. Then
Alice checks that

$$s'(r) = \sum_{y=1}^{\sqrt{n}} \tilde{a}(r, y)\, \tilde{b}(r, y) \pmod{q}.$$
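If $s = s'$, then the above test succeeds with certainty. On the other hand, if $s \ne s'$, then $s - s'$ is a
nonzero polynomial of degree at most $2(\sqrt{n} - 1)$, so

$$\Pr_{r \in \mathbb{F}_q}\left[s(r) = s'(r)\right] \le \frac{2(\sqrt{n} - 1)}{q} \le \frac{1}{3}$$

for all sufficiently large $n$, and hence the test fails with probability at least 2/3. ∎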
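
The protocol in the proof can be simulated with a short Python sketch. It makes three simplifying assumptions that are not in the text: the inputs are extended only in their first argument (the protocol only ever evaluates $\tilde{a}(r, y)$ and $\tilde{b}(r, y)$ for $y \in [\sqrt{n}]$), Merlin's polynomial $s'$ is represented by its values at the points $1, \ldots, 2\sqrt{n} - 1$ rather than by its coefficients, and all function names are hypothetical. Rows and columns are 0-indexed in the code and stand for the points $1, \ldots, \sqrt{n}$.

```python
def lagrange_eval(values, r, q):
    """Value at r of the unique polynomial over F_q of degree < len(values)
    that takes the value values[i] at the point i + 1 (requires len(values) <= q)."""
    k = len(values)
    total = 0
    for i in range(k):
        num = den = 1
        for j in range(k):
            if j != i:
                num = num * (r - (j + 1)) % q
                den = den * (i - j) % q
        total = (total + values[i] * num * pow(den, -1, q)) % q
    return total

def extend(f, r, y, q):
    """Low-degree extension of f in its first argument, evaluated at (r, y)."""
    return lagrange_eval([row[y] for row in f], r, q)

def honest_merlin(a, b, q):
    """Values of s(x) = sum_y a~(x, y) b~(x, y) at x = 1, ..., 2m - 1."""
    m = len(a)
    return [sum(extend(a, x, y, q) * extend(b, x, y, q) for y in range(m)) % q
            for x in range(1, 2 * m)]

def bob_message(b, r, q):
    """Bob's values b~(r, y) for every column y."""
    return [extend(b, r, y, q) for y in range(len(b))]

def alice_verify(a, s_prime, r, bob_vals, q):
    """Return the claimed inner product mod q if the check passes, else None."""
    m = len(a)
    rhs = sum(extend(a, r, y, q) * bob_vals[y] for y in range(m)) % q
    if lagrange_eval(s_prime, r, q) != rhs:
        return None
    return sum(s_prime[:m]) % q
```

With an honest Merlin the check passes for every challenge $r$. A dishonest $s'$ of degree at most $2(\sqrt{n} - 1)$ can agree with $s$ on at most $2(\sqrt{n} - 1)$ of the $q$ possible challenges, which is the source of the soundness bound.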

Let us make two remarks about Theorem 7.4.

First, we leave as an open problem whether one could do even better than $O(\sqrt{n})$ by using an
AM-protocol: that is, a protocol in which Alice (say) can send a single random challenge to Merlin
and receive a response. (As before, the communication cost is defined as the sum of the lengths of
all messages between Alice, Bob, and Merlin.) On the other hand, it is easy to generalize Theorem
7.4 to give an MAM-protocol (one where first Merlin sends a message, then Alice, then Merlin)
with complexity $O(n^{1/3} \log n)$. Similarly, one can give an MAMAM-protocol with complexity
$O(n^{1/4} \log n)$, an MAMAMAM-protocol with complexity $O(n^{1/5} \log n)$, and so on. In the limit of
arbitrarily many rounds, one gets an IP-protocol with complexity $O(\log n \log \log n)$.

Second, one might wonder how general Theorem 7.4 is. In particular, can it be extended to
give an MA-protocol for every predicate $f : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$ with total communication
$O(\sqrt{n})$? The answer is no, by a simple counting argument.

We can assume without loss of generality that every MA-protocol has the following form: first
Alice and Bob receive an $m$-bit message from Merlin; then they exchange $T$ messages between
themselves consisting of a single bit each. Let $p_t$ be the probability that the $t$-th message is a '1',
as a function of the $n + m + t - 1$ bits (one player's input plus Merlin's message plus $t - 1$ previous
messages) that are relevant at the $t$-th step. It is not hard to see that each $p_t$ can be assumed to
have the form $i/n^2$, where $i$ is an integer, with only negligible change to the acceptance probability.
In that case there are $(n^2)^{2^{n+m+t-1}}$ choices for each function $p_t : \{0,1\}^{n+m+t-1} \to [0,1]$, whence

$$(n^2)^{2^{n+m}} \, (n^2)^{2^{n+m+1}} \cdots (n^2)^{2^{n+m+T-1}}$$

possible protocols. But if $m + T = o(n)$, this product is still dwarfed by $2^{2^{2n}}$, the number of
distinct Boolean functions $f : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}$.

Thus, Theorem 7.4 has the amusing consequence that the Inner Product function, which is
often considered the "hardest" function in communication complexity, is actually unusually easy
for MA-protocols. (The special property of Inner Product we used is that it can be written as a
degree-2 polynomial in Alice's and Bob's inputs.)
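
The comparison in the counting argument of the second remark can be made explicit by taking logarithms. The following short derivation is a sketch that uses only the quantities named in that remark.

```latex
\[
\log_2 \prod_{t=1}^{T} (n^2)^{2^{n+m+t-1}}
  = 2 \log_2 n \sum_{t=1}^{T} 2^{n+m+t-1}
  = 2 \log_2 n \cdot 2^{n+m} \left(2^{T} - 1\right)
  < 2^{\,n+m+T+1} \log_2 n .
\]
% If m + T = o(n), the right-hand side is 2^{n + o(n)}, whereas the number of
% Boolean functions on 2n bits has logarithm 2^{2n}:
\[
2^{\,n + o(n)} \ll 2^{2n} = \log_2 2^{2^{2n}} .
\]
```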


8 Zero-Knowledge Protocols 


In searching complexity theory for potentially non-algebrizing results, it seems the main source is 
cryptography—and more specifically, cryptographic results that exploit the locality of computa- 
tion. These include the zero-knowledge protocol for NP due to Goldreich, Micali, and Wigderson 
[16] (henceforth the GMW Theorem), the two-party oblivious circuit evaluation of Yao [45], and 
potentially many others. Here we focus on the GMW Theorem. 

As discussed in Section 1, the GMW Theorem is inherently non-black-box, since it uses the
structure of an NP-complete problem (namely 3-Coloring). On the other hand, the way the
theorem exploits that structure seems inherently non-algebraic: it does not involve finite fields or
low-degree polynomials. Nevertheless, in this section we will show that even the GMW Theorem
is algebrizing.

Let us start by defining the class CZK, or Computational Zero Knowledge, as well as $\mathsf{CZK}^A$
(that is, CZK with oracle access to $A$).


Definition 8.1 A language $L$ is in CZK if there exists a protocol in which a probabilistic
polynomial-time verifier $V$ interacts with a computationally-unbounded prover $P$, such that for all inputs $x$ the
following holds.


• Completeness. If $x \in L$ then $P$ causes $V$ to accept with probability 1.


• Soundness. If $x \notin L$ then no prover $P^*$ can cause $V$ to accept with probability more than
1/2.


• Zero-Knowledge. If $x \in L$ then there exists an expected polynomial-time simulator that,
given black-box access¹³ to a polynomial-time verifier $V^*$, produces a message transcript that
cannot be efficiently distinguished from a transcript of an actual conversation between $V^*$ and
$P$. (In other words, the two probability distributions over message transcripts are
computationally indistinguishable.)


We define $\mathsf{CZK}^A$ to mean the version of CZK where all three machines (the prover, verifier,
and simulator) have access to the oracle $A$.


Then Goldreich et al. [16] proved the following:

Theorem 8.2 (GMW Theorem) If one-way functions exist then $\mathsf{NP} \subseteq \mathsf{CZK}$.


It is not hard to show that Theorem 8.2 is non-relativizing. Intuitively, given a black-box
function $f : \{0,1\}^n \to \{0,1\}$, suppose a prover $P$ wants to convince a polynomial-time verifier $V$
that there exists a $z$ such that $f(z) = 1$. Then there are two possibilities: either $P$ can cause $V$
to query $f(z)$, in which case the protocol will necessarily violate the zero-knowledge condition (by
revealing $z$); or else $P$ cannot cause $V$ to query $f(z)$, in which case the protocol will violate either
completeness or soundness. By formalizing this intuition one can show the following:


Theorem 8.3 There exists an oracle $A$ relative to which


(i) one-way functions exist (i.e. there exist functions computable in $\mathsf{P}^A$ that cannot be inverted
in $\mathsf{BPP}^A$ on a non-negligible fraction of inputs), but
(ii) $\mathsf{NP}^A \not\subset \mathsf{CZK}^A$.

By contrast, we now show that the GMW Theorem is algebrizing. In proving this theorem, we
will exploit the availability of a low-degree extension $\tilde{A}$ to make the oracle queries zero-knowledge.


¹³ CZK can also be defined in a way that allows non-black-box access (i.e., access to the verifier's source code). However,
our protocol, like the original GMW one, will only require black-box access.

Theorem 8.4 Let $A$ be an oracle and let $\tilde{A}$ be any extension of $A$. Suppose there exists a
one-way function, computable in $\mathsf{P}^A$, which cannot be inverted with non-negligible probability by $\mathsf{BPP}^{\tilde{A}}$
adversaries. Then $\mathsf{NP}^A \subseteq \mathsf{CZK}^{\tilde{A}}$.


Proof. We will assume for simplicity that the extension $\tilde{A}$ is just a polynomial $\tilde{A} : \mathbb{F}^n \to \mathbb{F}$ over a
fixed finite field $\mathbb{F}$, which extends a Boolean function $A : \{0,1\}^n \to \{0,1\}$. Also, let $d = \deg(\tilde{A})$,
and assume $d < \mathrm{char}(\mathbb{F})$. (The proof easily generalizes to the case where $A$ and $\tilde{A}$ are as defined
in Section 2.)

Deciding whether an $\mathsf{NP}^A$ machine accepts is equivalent to deciding the satisfiability of a Boolean
formula $\varphi(w_1, \ldots, w_m)$, which consists of a conjunction of two types of clauses:

(i) Standard 3SAT clauses over the variables $w_1, \ldots, w_m$.

(ii) "Oracle clauses," each of which has the form $y_i = A(Y_i)$, where $Y_i \in \{0,1\}^n$ is a query to $A$
(composed of $n$ variables $w_{j_1}, \ldots, w_{j_n}$) and $y_i$ is its expected answer (composed of another
variable $w_{j_{n+1}}$).

Given such a formula $\varphi$, our goal is to convince a $\mathsf{BPP}^{\tilde{A}}$ verifier that $\varphi$ is satisfiable, without
revealing anything about the satisfying assignment $w_1, \ldots, w_m$ (or anything else). To achieve
this, we will describe a constant-round zero-knowledge protocol in which the verifier accepts with
probability 1 given an honest prover, and rejects with probability $\Omega(1/\mathrm{poly}(n))$ given a cheating
prover. Given any such protocol, it is clear that we can increase the soundness gap to $\Omega(1)$, by
repeating the protocol $\mathrm{poly}(n)$ times.

Let us describe our protocol in the case that the prover and verifier are both honest. In the
first round, the prover uses the one-way function to send the verifier commitments to the following
objects:

• A satisfying assignment $w_1, \ldots, w_m$ for $\varphi$.

• A random nonzero field element $r \in \mathbb{F}$.

• For each oracle clause $y_i = A(Y_i)$,

  - A random affine function $L_i : \mathbb{F} \to \mathbb{F}^n$ (in other words, a line) such that $L_i(0) = Y_i$ and
    $L_i(1) \ne Y_i$.
  - A polynomial $p_i : \mathbb{F} \to \mathbb{F}$, of degree at most $d$, such that $p_i(t) = \tilde{A}(L_i(t))$ for all $t \in \mathbb{F}$.

Given these objects, the verifier can choose randomly to perform one of the following four tests:

(1) Ask the prover for a zero-knowledge proof that the standard 3SAT clauses are satisfied.

(2) Choose a random oracle clause $y_i = A(Y_i)$, and ask for a zero-knowledge proof that $L_i(0) = Y_i$.

(3) Choose a random oracle clause $y_i = A(Y_i)$, and ask for a zero-knowledge proof that $p_i(0) = y_i$.

(4) Choose a random oracle clause $y_i = A(Y_i)$ as well as a random nonzero field element $s \in \mathbb{F}$.
Ask for the value $u$ of $L_i(rs)$, as well as a zero-knowledge proof that $u = L_i(rs)$. Query
$\tilde{A}(u)$. Ask for a zero-knowledge proof that $p_i(rs) = \tilde{A}(u)$.

To prove the correctness of the above protocol, we need to show three things: completeness,
zero-knowledge, and soundness.

Completeness: This is immediate. If the prover is honest, then tests (1)-(4) will all pass with
probability 1.

Zero-Knowledge: Let $V^*$ be any verifier. We will construct a simulator to create a transcript
which is computationally indistinguishable from its communication with the honest prover $P$. The
simulator first chooses random values for the $w_i$'s (which might not be satisfying at all) and commits
to them. It also commits to a random $r \in \mathbb{F}^*$. For tests (1)-(3), the simulator acts as in the proof
of the GMW Theorem [16]. So the interesting test is (4).

First note that $rs$ is a random nonzero element, regardless of how $V^*$ selected $s$. Now the key
observation is that $L_i(rs)$, the point at which the verifier queries $\tilde{A}$, is just a uniform random point
in $\mathbb{F}^n \setminus \{Y_i\}$. Thus, we can construct a simulator as follows: if the verifier is going to ask the prover
about an oracle clause $y_i = A(Y_i)$, then first choose a point $X_i \in \mathbb{F}^n$ uniformly at random and query
$\tilde{A}(X_i)$. (The probability that $X_i$ will equal $Y_i$ is negligible.) Next choose nonzero field elements
$r, s \in \mathbb{F}$ uniformly at random. Let $L_i$ be the unique line such that $L_i(0) = Y_i$ and $L_i(rs) = X_i$,
and let $p_i$ be the unique degree-$d$ polynomial such that $p_i(t) = \tilde{A}(L_i(t))$ for all $t \in \mathbb{F}$ (which can be
found by interpolation). Construct commitments to all of these objects. Assuming the underlying
commitment scheme is secure against $\mathsf{BPP}^{\tilde{A}}$ machines, the resulting probability distribution over
messages will be computationally indistinguishable from the actual distribution.

Soundness: Suppose the $\mathsf{NP}^A$ machine rejects. Then when the prover sends the verifier a
commitment to the "satisfying assignment" $w_1, \ldots, w_m$, some clause $C$ of $\varphi$ will necessarily be
unsatisfied. If $C$ is one of the standard 3SAT clauses, then by the standard GMW Theorem, the
prover will be caught with $\Omega(1/\mathrm{poly}(n))$ probability when the verifier performs test (1). So the
interesting case is that $C$ is an oracle clause $y_i = A(Y_i)$.

In this case, since the truth is that $y_i \ne A(Y_i)$, at least one of the following must hold:

(i) $y_i \ne p_i(0)$,
(ii) $p_i(0) \ne \tilde{A}(L_i(0))$, or
(iii) $\tilde{A}(L_i(0)) \ne A(Y_i)$.

If (i) holds, then the prover will be caught with $\Omega(1/\mathrm{poly}(n))$ probability when the verifier
performs test (3).

If (ii) holds, then the two degree-$d$ polynomials $p_i(t)$ and $\tilde{A}(L_i(t))$ must differ on at least
a $1 - d/\mathrm{char}(\mathbb{F})$ fraction of points $t \in \mathbb{F}$. Hence, since $rs$ is a random nonzero element of $\mathbb{F}$
conditioned only on $s$ being random, the prover will be caught with $\Omega(1/\mathrm{poly}(n))$ probability
when the verifier performs test (4).

If (iii) holds, then $L_i(0) \ne Y_i$. Hence the prover will be caught with $\Omega(1/\mathrm{poly}(n))$ probability
when the verifier performs test (2). ∎

Let us make two remarks about Theorem 8.4.

(1) Notice that in our zero-knowledge protocol, the prover's strategy can actually be implemented
in $\mathsf{BPP}^{\tilde{A}}$, given a satisfying assignment $w_1, \ldots, w_m$ for the formula $\varphi$.

(2) Although our protocol needed $\mathrm{poly}(n)$ rounds to achieve constant soundness (or $O(1)$ rounds
to achieve $1/\mathrm{poly}(n)$ soundness), we have a variant that achieves constant soundness with
a constant number of rounds. For the non-oracle part of the protocol, it is well-known how
to do this. To handle oracle queries, one composes the polynomially many queries that the
verifier selects among by passing a low-degree curve through them. This reduces the case (4)
to a single random query on this curve. We omit the details.


9 The Limits of Our Limit 


Some would argue with this paper’s basic message, on the grounds that we already have various 
non-relativizing results that are not based on arithmetization. Besides the GMW protocol (which 
we dealt with in Section 8), the following examples have been proposed: 


(1) Small-depth circuit lower bounds, such as $\mathsf{AC}^0 \ne \mathsf{TC}^0$ [32], can be shown to fail relative to
suitable oracle gates.


(2) Arora, Impagliazzo, and Vazirani [3] argue that even the Cook-Levin Theorem (and by
extension, the PCP Theorem) should be considered non-relativizing.


(3) Hartmanis et al. [17] cite, as examples of non-relativizing results predating the "interactive
proofs revolution," the 1977 result of Hopcroft, Paul, and Valiant [19] that $\mathsf{TIME}(f(n)) \ne
\mathsf{SPACE}(f(n))$ for any space-constructible $f$, as well as the 1983 result of Paul et al. [29] that
$\mathsf{TIME}(n) \ne \mathsf{NTIME}(n)$. Recent time-space tradeoffs for SAT (see van Melkebeek [28] for a
survey) have a similar flavor.


There are two points we can make regarding these examples. Firstly, the small-depth circuit
lower bounds are already "well covered" by the natural proofs barrier. Secondly, because of
subtleties in defining the oracle access mechanism, there is legitimate debate about whether the
results listed in (2) and (3) should "truly" be considered non-relativizing; see Fortnow [13] for a
contrary perspective.¹⁴

Having said this, we do not wish to be dogmatic. Our results tell us a great deal about the future 
prospects for arithmetization, but about other non-relativizing techniques they are comparatively 
silent. 


10 Beyond Algebrizing Techniques? 


In this section, we discuss two ideas one might have for going beyond the algebrization barrier, and 
show that some of our limitation theorems apply even to these ideas. 


10.1 k-Algebrization 


One of the most basic properties of relativization is transitivity: if two complexity class inclusions
$\mathcal{C} \subseteq \mathcal{D}$ and $\mathcal{D} \subseteq \mathcal{E}$ both relativize, then the inclusion $\mathcal{C} \subseteq \mathcal{E}$ also relativizes. Thus, it is natural
to ask whether algebrization is transitive in the same sense. We do not know the answer to this
question, and suspect that it is negative. However, there is a kind of transitivity that holds. Given
an oracle $A$, let a double-extension $\tilde{\tilde{A}}$ of $A$ be an oracle produced by


(1) taking a low-degree extension $\tilde{A}$ of $A$,


¹⁴ Eric Allender has suggested the delightful term "irrelativizing," for results that neither relativize nor fail to
relativize.

(2) letting $f$ be a Boolean oracle such that $f(x, i)$ is the $i$-th bit in the binary representation of
$\tilde{A}(x)$, and then


(3) taking a low-degree extension $\tilde{\tilde{A}}$ of $f$.


(One can similarly define a triple-extension $\tilde{\tilde{\tilde{A}}}$, and so on.) Then the following is immediate:


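Proposition 10.1 For all complexity classes $\mathcal{C}, \mathcal{D}, \mathcal{E}$, if $\mathcal{C}^A \subseteq \mathcal{D}^{\tilde{A}}$ and $\mathcal{D}^A \subseteq \mathcal{E}^{\tilde{A}}$ for all $A, \tilde{A}$, then
$\mathcal{C}^A \subseteq \mathcal{E}^{\tilde{\tilde{A}}}$ for all $A, \tilde{\tilde{A}}$.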
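
Step (2) of the double-extension construction, turning the field-valued oracle $\tilde{A}$ into a Boolean oracle $f$, can be sketched in Python. The sketch assumes that values of $\tilde{A}$ are represented as nonnegative integers (for example, elements of $\mathbb{F}_q$ written as $0, \ldots, q - 1$) and that bits are indexed from the least significant bit starting at 0. Both conventions, and the name `bit_oracle`, are assumptions rather than details fixed by the text.

```python
def bit_oracle(A_tilde):
    """Return f with f(x, i) = the i-th bit of the integer A_tilde(x).

    Bits are counted from the least significant bit, starting at i = 0.
    """
    def f(x, i):
        return (A_tilde(x) >> i) & 1
    return f
```

A low-degree extension of this $f$ would then play the role of the double-extension $\tilde{\tilde{A}}$.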


Now, the above suggests one possible approach to defeating the algebrization barrier. Call a
complexity class inclusion $\mathcal{C} \subseteq \mathcal{D}$ double-algebrizing if $\mathcal{C}^A \subseteq \mathcal{D}^{\tilde{\tilde{A}}}$ for all $A, \tilde{\tilde{A}}$, triple-algebrizing if
$\mathcal{C}^A \subseteq \mathcal{D}^{\tilde{\tilde{\tilde{A}}}}$ for all $A, \tilde{\tilde{\tilde{A}}}$, and so on. Then any $k$-algebrizing result is also $(k + 1)$-algebrizing, but the
converse need not hold. We thus get a whole infinite hierarchy of proof techniques, of which this
paper studied only the first level.

Alas, we now show that any proof of $\mathsf{P} \ne \mathsf{NP}$ will need to go outside the entire hierarchy.


Theorem 10.2 Any proof of $\mathsf{P} \ne \mathsf{NP}$ will require techniques that are not merely non-algebrizing,
but non-$k$-algebrizing for every constant $k$.


Proof. Recall that in Theorem 5.1, we showed that any proof of $\mathsf{P} \ne \mathsf{NP}$ will require
non-algebrizing techniques, by giving oracles $A, \tilde{A}$ such that $\mathsf{NP}^{\tilde{A}} = \mathsf{P}^A = \mathsf{PSPACE}$. In that case, $A$
was any PSPACE-complete language, while $\tilde{A}$ was the unique multilinear extension of $A$, which is
also PSPACE-complete by Babai, Fortnow, and Lund [4]. Now let $\tilde{\tilde{A}}$ be the multilinear extension
of the binary representation of $\tilde{A}$. Then $\tilde{\tilde{A}}$ is also PSPACE-complete by Babai et al. Hence
$\mathsf{NP}^{\tilde{\tilde{A}}} = \mathsf{P}^A = \mathsf{PSPACE}$. The same is true inductively for $\tilde{\tilde{\tilde{A}}}$ and so on. ∎

Similarly, any proof of $\mathsf{P} \ne \mathsf{PSPACE}$ will require techniques that are non-$k$-algebrizing for every
$k$.

On the other hand, for most of the other open problems mentioned in this paper (P versus RP,
NEXP versus P/poly, and so on) we do not know whether double-algebrizing techniques already
suffice. That is, we do not know whether there exist $A, \tilde{\tilde{A}}$ such that $\mathsf{RP}^A \not\subset \mathsf{P}^{\tilde{\tilde{A}}}$, $\mathsf{NEXP}^{\tilde{\tilde{A}}} \subset \mathsf{P}^A/\mathrm{poly}$,
and so on. Thus, of the many open problems that are beyond the reach of arithmetization, at least
some could conceivably be solved by "$k$-arithmetization."


10.2 Non-Commutative Algebras 


We have shown that arithmetization— “lifting” Boolean logic operations to arithmetic operations 
over the integers or a field—will not suffice to solve many of the open problems in complexity 
theory. A natural question is whether one could evade our results by lifting to other algebras, 
particularly non-commutative ones. Unfortunately, we now explain why our limitation theorems 
extend with little change to associative algebras with identity over a field. This is a very broad 
class that includes matrix algebras, quaternions, Clifford algebras, and more. The one constraint 
is that the dimension of the algebra (or equivalently, the representation size of the elements) should 
be less than exponential in $n$.¹⁵


¹⁵ This is similar to the requirement that the integers should not be too large in Section 6.

Formally, an algebra over the field $\mathbb{F}$ is a vector space $V$ over $\mathbb{F}$, which is equipped with a
multiplication operation $V \times V \to V$ such that $u(v + w) = uv + uw$ for all $u, v, w \in V$. The
algebra is associative if its multiplication is associative, and has identity if one of its elements is a
multiplicative identity. The dimension of the algebra is the dimension of $V$ as a vector space.

A crucial observation is that every $k$-dimensional associative algebra over $\mathbb{F}$ is isomorphic to a
subalgebra of $M_k(\mathbb{F})$, the algebra of $k \times k$ matrices with entries in the field $\mathbb{F}$. The embedding is
the natural one: every element $v \in V$ defines a linear transformation $M_v$ via $M_v x = v \cdot x$.

We will now explain why, for associative algebras with identity, our main results go through
almost without change. For notational simplicity, we will state our results in terms of the full
matrix algebra $M_k(\mathbb{F})$, though the results would work just as well for any subalgebra of $M_k(\mathbb{F})$
containing the zero and identity elements.

Given a polynomial $p : M_k(\mathbb{F})^n \to M_k(\mathbb{F})$, call $p$ sorted-multilinear if it has the form

$$p(X_1, \ldots, X_n) = \sum_{S \subseteq [n]} a_S \prod_{i \in S} X_i$$

where the coefficients $a_S$ belong to $\mathbb{F}$, and all products are taken in order from $X_1$ to $X_n$.
Now let $I_k$ be the $k \times k$ identity matrix and $0_k$ be the all-zeroes matrix. Also, call a point
$Z \in M_k(\mathbb{F})^n$ Boolean if every coordinate is either $I_k$ or $0_k$, and let

$$\delta_Z(X) := \prod_{i=1}^{n} \left[ Z_i X_i + (I_k - Z_i)(I_k - X_i) \right]$$

be the unique sorted-multilinear polynomial such that $\delta_Z(Z) = I_k$ and $\delta_Z(W) = 0_k$ for all Boolean
$W \ne Z$.
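
The defining property of $\delta_Z$ can be checked directly for small parameters. The following Python sketch assumes integer matrices in place of a general field $\mathbb{F}$ and represents matrices as nested lists; the helper names are hypothetical.

```python
from itertools import product

def identity(k):
    return [[1 if r == c else 0 for c in range(k)] for r in range(k)]

def zero(k):
    return [[0] * k for _ in range(k)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    k = len(A)
    return [[sum(A[r][j] * B[j][c] for j in range(k)) for c in range(k)]
            for r in range(k)]

def delta(Z, X):
    """delta_Z(X) = prod_i [Z_i X_i + (I - Z_i)(I - X_i)], multiplied in order i = 1..n."""
    k = len(Z[0])
    I = identity(k)
    out = identity(k)
    for Zi, Xi in zip(Z, X):
        factor = mat_add(mat_mul(Zi, Xi),
                         mat_mul(mat_sub(I, Zi), mat_sub(I, Xi)))
        out = mat_mul(out, factor)  # the order of multiplication matters here
    return out
```

On Boolean points each factor is either $I_k$ or $0_k$, so the product is $I_k$ exactly when $W = Z$. At non-Boolean points the factors need not commute, which is why the order of multiplication is fixed.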
Then just as in the commutative case, every sorted-multilinear polynomial $m$ has a unique
representation in the form

$$m(X) = \sum_{Z \in \{0_k, I_k\}^n} m_Z \, \delta_Z(X)$$

where $m_Z$ is a coefficient in $\mathbb{F}$ such that $m(Z) = m_Z I_k$. Also, every Boolean function
$f : \{0_k, I_k\}^n \to \{0_k, I_k\}$ has a unique extension

$$\tilde{f}(X) = \sum_{Z \in \{0_k, I_k\}^n} f(Z) \, \delta_Z(X)$$

as a sorted-multilinear polynomial.

Provided $k = O(\mathrm{poly}(n))$, it is easy to show that any proof of $\mathsf{P} \ne \mathsf{NP}$ will require
"non-commutatively non-algebrizing techniques." Once again, we can let $A$ be any PSPACE-complete
language, and let $\tilde{A}$ be the unique sorted-multilinear extension of $A$ over $M_k(\mathbb{F})$. Then the
observations of Babai, Fortnow, and Lund [4] imply that $\tilde{A}$ is also computable in PSPACE, and
hence $\mathsf{NP}^{\tilde{A}} = \mathsf{P}^A = \mathsf{PSPACE}$.

We can also repeat the separation results of Sections 4 and 5 in the non-commutative setting.
Rather than going through every result again, we will just give one illustrative example. We
will show that, given a non-commutative extension $\tilde{A} : M_k(\mathbb{F})^n \to M_k(\mathbb{F})$ of a Boolean function
$A : \{0_k, I_k\}^n \to \{0_k, I_k\}$, any deterministic algorithm needs $\Omega(2^n/k^2)$ queries to $\tilde{A}$ to find a
Boolean point $W \in \{0_k, I_k\}^n$ such that $A(W) = I_k$. (Note that switching from fields to $k \times k$
matrix algebras causes us to lose a factor of $k^2$ in the bound.)

The first step is to prove a non-commutative version of Lemma 4.2. 
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Lemma 10.3 Let Y1,...,¥; be any points in M,(F)". Then there exists a sorted-multilinear 
polynomial m : Mp (F)" — M; (F) such that 


(i) m(Y;) = 0, for alli € |t], and 
(ii) m(W) = Ix for at least 2” — k?t Boolean points W € {0p, Ip}”. 


Proof. If we represent m as

$$m(X) = \sum_{Z \in \{0_k, I_k\}^n} m_Z \, \delta_Z(X),$$

then the constraint $m(Y_i) = 0_k$ for all $i \in [t]$ corresponds to $k^2 t$ linear equations over $\mathbb{F}$ relating the $2^n$
coefficients $m_Z$. By basic linear algebra, it follows that there must be a solution in which at least
$2^n - k^2 t$ of the coefficients $m_Z$ are equal to 1 (so that $m_Z I_k = I_k$), and hence $m(W) = I_k$ for at
least $2^n - k^2 t$ Boolean points W. ∎
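
The counting step in this proof can be spelled out as follows. The solution space $V$, the index set $S$, and the entry notation $[\cdot]_{a,b}$ are labels introduced only for this sketch.

```latex
% One scalar equation for each i in [t] and each matrix entry (a,b) in [k] x [k]:
\sum_{Z \in \{0_k, I_k\}^n} m_Z \, [\delta_Z(Y_i)]_{a,b} = 0 .

% These are k^2 t homogeneous linear equations in the 2^n unknowns m_Z, so
V := \bigl\{ (m_Z)_Z \in \mathbb{F}^{2^n} : m(Y_i) = 0_k \ \text{for all } i \in [t] \bigr\},
\qquad \dim V \;\ge\; 2^n - k^2 t .

% Any subspace of dimension d has a set S of d coordinates on which the
% projection V -> F^S is a bijection (take a basis in reduced echelon form).
% Choosing the solution with m_Z = 1 for every Z in S gives
m(Z) = m_Z I_k = I_k \quad \text{for all } Z \in S, \qquad |S| = \dim V \ge 2^n - k^2 t .
```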

Using Lemma 10.3, we can also prove a non-commutative version of Lemma 4.3. 











Lemma 10.4 Let $Y_1, \ldots, Y_t$ be any points in $M_k(\mathbb{F})^n$. Then for at least $2^n - k^2 t$ Boolean points
$W \in \{0_k, I_k\}^n$, there exists a multiquadratic polynomial $p : M_k(\mathbb{F})^n \to M_k(\mathbb{F})$ such that

(i) $p(Y_i) = 0_k$ for all $i \in [t]$,
(ii) $p(W) = I_k$, and
(iii) $p(Z) = 0_k$ for all Boolean $Z \neq W$.

Proof. Let $m : M_k(\mathbb{F})^n \to M_k(\mathbb{F})$ be the sorted-multilinear polynomial from Lemma 10.3, and pick
any Boolean W such that $m(W) = I_k$. Then a multiquadratic polynomial p satisfying properties
(i)-(iii) can be obtained from m as follows:

$$p(X) := m(X) \, \delta_W(X).$$

∎
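
For completeness, the three properties can be checked term by term. This is a sketch that relies only on $m(Y_i) = 0_k$, $m(W) = I_k$, and the fact that $\delta_W$ evaluates to $I_k$ at $W$ and to $0_k$ at every other Boolean point, as the unique-representation property requires.

```latex
\begin{align*}
p(Y_i) &= m(Y_i)\,\delta_W(Y_i) = 0_k \cdot \delta_W(Y_i) = 0_k
        && \text{for all } i \in [t], \\
p(W)   &= m(W)\,\delta_W(W) = I_k \cdot I_k = I_k, \\
p(Z)   &= m(Z)\,\delta_W(Z) = m(Z) \cdot 0_k = 0_k
        && \text{for all Boolean } Z \neq W .
\end{align*}
% Each variable X_j occurs at most once in every monomial of m and at most once
% in every monomial of \delta_W, hence at most twice in every monomial of p,
% which is why p is multiquadratic.
```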
Lemma 10.4 immediately gives us a non-commutative version of Theorem 4.4, the lower bound 
on deterministic query complexity of the OR function. 


Theorem 10.5 $\tilde{D}_{M_k(\mathbb{F}),2}(\mathrm{OR}) = \Omega(2^n/k^2)$ for every matrix algebra $M_k(\mathbb{F})$.


By using Theorem 10.5, for every k = O(poly(n)) one can construct an oracle A, and a $k \times k$
matrix extension $\tilde{A}$ of A, such that $\mathsf{NP}^{A} \not\subset \mathsf{P}^{\tilde{A}}$. This then implies that any resolution of the P
versus NP problem will require “non-commutatively non-algebrizing techniques.”


11 Conclusions and Open Problems 


Arithmetization is one of the most powerful ideas in complexity theory. It led to the IP = PSPACE 
Theorem, the PCP Theorem, non-relativizing circuit lower bounds, and many other achievements 
of the last two decades. Yet we showed that arithmetization is fundamentally unable to resolve 
many of the barrier problems in the field, such as P versus NP, derandomization of RP, and circuit 
lower bounds for NEXP. 

Can we pinpoint what it is about arithmetization that makes it incapable of solving these
problems? In our view, arithmetization simply fails to “open the black box wide enough.” In
a typical arithmetization proof, one starts with a polynomial-size Boolean formula φ, and uses φ
to produce a low-degree polynomial p. But having done so, one then treats p as an arbitrary
black-box function, subject only to the constraint that deg(p) is small. Nowhere does one exploit
the small size of φ, except insofar as it lets one evaluate p in the first place. The message of this
paper has been that, to make further progress, one will have to probe φ in some “deeper” way.

To reach this conclusion, we introduced a new model of algebraic query complexity, which has 
already found independent applications in communication complexity, and which has numerous 
facets to explore in its own right. 

We now propose five directions for future work, and list some of the main open problems in 
each direction. 

(1) Find non-algebrizing techniques. This, of course, is the central challenge we leave. 

If arithmetization, which embeds the Boolean field $\mathbb{F}_2$ into a larger field or the integers, is not
enough, then a natural idea is to embed $\mathbb{F}_2$ into a “richer” algebra. But in Section 10.2 we showed
that for every subexponential k, the algebra of $k \times k$ matrices is still not “sufficiently rich.” So the
question arises: what other useful algebraic structures can mathematics offer complexity theory?

Another possible way around the algebrization barrier is “recursive arithmetization”: first
arithmetizing a Boolean formula, then reinterpreting the result as a Boolean function, then arithmetizing
that function, and so on ad infinitum. In Section 10.1, we showed that k-arithmetization is not
powerful enough to prove P ≠ NP for any constant k. But we have no idea whether
double-arithmetization is already powerful enough to prove P = RP or NEXP ⊄ P/poly.

(2) Find ways to exploit the structure of polynomials produced by arithmetization.

This is also a possible way around the algebrization barrier, but seems important enough to deserve
its own heading. The question is: given that a polynomial $\tilde{A} : \mathbb{F}^n \to \mathbb{F}$ was produced by arithmetizing
a small Boolean formula, does $\tilde{A}$ have any properties besides low degree that a polynomial-time
algorithm querying it could exploit? Or alternatively, do there exist “pseudorandom extensions”
$\tilde{A} : \mathbb{F}^n \to \mathbb{F}$ (that is, low-degree extensions that are indistinguishable from “random” low-degree
extensions by any $\mathsf{BPP}^{\tilde{A}}$ machine, but that were actually produced by arithmetizing small Boolean
formulas)? As a hint of how the structure of $\tilde{A}$ might be exploited, let us point out that, if $\tilde{A}$ was
produced by arithmetizing a 3SAT formula φ, then one can actually recover φ in polynomial time
by making oracle queries to $\tilde{A}$.¹⁶

(3) Find open problems that can still be solved with algebrizing techniques. In the 
short term, this is perhaps the most “practical” response to the algebrization barrier. Here is a 
problem that, for all we know, might still be solvable with tried-and-true arithmetization methods: 
improve the result of Santhanam [36] that $\mathsf{PromiseMA} \not\subset \mathsf{SIZE}(n^k)$ to $\mathsf{MA} \not\subset \mathsf{SIZE}(n^k)$.


¹⁶ The algorithm is as follows. Assume for simplicity that every clause of φ contains exactly 3 literals. Then for
each triple of variables $(x_i, x_j, x_k)$:

(1) Set the remaining n − 3 variables to constants.

(2) Query $\tilde{A}$ on $O(\deg(\tilde{A})^3)$ values of $x_i, x_j, x_k$, then use interpolation to recover a trivariate polynomial
$p(x_i, x_j, x_k)$.

(3) Try to divide p by each of the eight multilinear polynomials ($1 - x_i x_j x_k$, $1 - x_i x_j (1 - x_k)$, etc.) corresponding
to the possible 3SAT clauses involving $x_i, x_j, x_k$. Whenever such a polynomial divides p, we have learned
another clause of φ.

The only thing to be careful about is that, in setting the remaining n − 3 variables to constants, we do not
inadvertently set $p(x_i, x_j, x_k) = 0$. But this is easily arranged.
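
The clause-recovery idea can be illustrated with a small Python sketch. It is not the interpolate-and-divide procedure described above. As a simplifying substitute, it tests whether a candidate clause polynomial $1 - t_i t_j t_k$ divides the oracle polynomial by checking that the oracle vanishes at several random points where $t_i t_j t_k = 1$. This is a randomized test that can err with small probability over a large field. The prime P, the clause encoding, and all function names are assumptions made for this example.

```python
import itertools
import random

P = 2_147_483_647  # the prime 2^31 - 1; working over F_P is an assumption

def term(x, b):
    """Factor that, on Boolean inputs, equals 1 exactly when x == b."""
    return x % P if b else (1 - x) % P

def arithmetize(clauses):
    """Oracle for the product of the clause polynomials 1 - t_i * t_j * t_k.

    Hypothetical encoding: a clause is ((i, b_i), (j, b_j), (k, b_k)) and is
    violated exactly when x_i = b_i, x_j = b_j and x_k = b_k.
    """
    def oracle(x):
        value = 1
        for clause in clauses:
            prod = 1
            for var, b in clause:
                prod = prod * term(x[var], b) % P
            value = value * (1 - prod) % P
        return value
    return oracle

def vanishes_on_random_root(oracle, n, triple, bits, rng):
    """Evaluate the oracle at a random point where the candidate clause factor is 0."""
    x = [rng.randrange(P) for _ in range(n)]   # random values for the other variables
    t1 = rng.randrange(1, P)
    t2 = rng.randrange(1, P)
    t3 = pow(t1 * t2 % P, -1, P)               # forces t1*t2*t3 = 1 (Python 3.8+)
    for var, b, t in zip(triple, bits, (t1, t2, t3)):
        x[var] = t if b else (1 - t) % P       # chosen so that term(x[var], b) == t
    return oracle(x) == 0

def recover_clauses(oracle, n, trials=5, rng=random):
    """Return every candidate clause whose factor passes all randomized tests."""
    found = set()
    for triple in itertools.combinations(range(n), 3):
        for bits in itertools.product((0, 1), repeat=3):
            if all(vanishes_on_random_root(oracle, n, triple, bits, rng)
                   for _ in range(trials)):
                found.add(tuple(zip(triple, bits)))
    return found
```

A clause that is really present always passes, since its factor is exactly zero at every test point. A clause that is absent passes only if some other factor happens to vanish at all the random test points, which is unlikely when P is large.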

(4) Prove algebraic oracle separations. Can we show that the interactive protocol of
Lund, Fortnow, Karloff, and Nisan [27] cannot be made constant-round by any algebrizing
technique? In other words, can we give an oracle A and extension $\tilde{A}$ such that $\mathsf{coNP}^{A} \not\subset \mathsf{AM}^{\tilde{A}}$? In
the communication complexity setting, Klauck [25] mentions coNP versus AM as a difficult open
problem; perhaps the algebraic query version is easier.

The larger challenge is to give algebraic oracles that separate all the levels of the polynomial
hierarchy, or at least separate the polynomial hierarchy from larger classes such as $\mathsf{P}^{\#\mathsf{P}}$ and
PSPACE.¹⁷ In the standard oracle setting, these separations were achieved by Furst-Saxe-Sipser
[15] and Yao [44] in the 1980s, whereas in the communication setting they remain notorious open
problems. Again, algebraic query complexity provides a natural intermediate case between query
complexity and communication complexity.

Can we show that non-algebrizing techniques would be needed to give a Karp-Lipton collapse 
to MA? Or give an interactive protocol for coNP where the prover has the power of NP? 

Can we show that a $\mathsf{BQP}^{\tilde{A}}$ or $\mathsf{MA}^{\tilde{A}}$ machine needs exponentially many queries to the extension
oracle $\tilde{A}$, not only to solve the Disjointness problem, but also just to find a Boolean point x such
that A(x) = 1? Also, in the integers case, can we show that a $\mathsf{P}^{\hat{A}}$ machine needs exponentially
many queries to $\hat{A}$ to find an x such that A(x) = 1? (That is, can we remove the technical
limitations of Theorem 6.10 regarding the size of the inputs and outputs?)

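(5) Understand algebrization better. In defining what it meant for inclusions and
separations to algebrize, was it essential to give only one machine access to the extension oracle $\tilde{A}$, and
the other access to A? Or could we show (for example) not only that $\mathsf{coNP}^{A} \subseteq \mathsf{IP}^{\tilde{A}}$, but also that
$\mathsf{coNP}^{\tilde{A}} \subseteq \mathsf{IP}^{\tilde{A}}$? What about improving the separation $\mathsf{PP}^{\tilde{A}} \not\subset \mathsf{SIZE}^{A}(n^k)$ to $\mathsf{PP}^{\tilde{A}} \not\subset \mathsf{SIZE}^{\tilde{A}}(n^k)$?
Likewise, can we improve the separation $\mathsf{MA}_{\mathsf{EXP}}^{\tilde{A}} \not\subset \mathsf{P}^{A}/\mathsf{poly}$ to $\mathsf{MA}_{\mathsf{EXP}}^{\tilde{A}[\mathrm{poly}]} \not\subset \mathsf{P}^{\tilde{A}}/\mathsf{poly}$?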


Are there complexity classes C and D that can be separated by a finite field extension $\tilde{A}$, but
not by an integer extension $\hat{A}$? Are there complexity classes that can be separated in the algebraic
oracle setting, but not the communication setting?

Low-degree extensions can be seen as just one example of an error-correcting code. To what 
extent do our results carry over to arbitrary error-correcting codes? 

Arora, Impagliazzo, and Vazirani [3] showed that contrary relativizations of the same statement
(for example, $\mathsf{P}^{A} = \mathsf{NP}^{A}$ and $\mathsf{P}^{B} \neq \mathsf{NP}^{B}$) can be interpreted as proving independence from a certain
formal system. Can one interpret contrary algebrizations the same way?

¹⁷ If the oracle $\tilde{A}$ only involves a low-degree extension over $\mathbb{F}_q$, for some fixed prime q = o(n/log n), then we can
give A, $\tilde{A}$ such that $\mathsf{PP}^{A} \not\subset \mathsf{PH}^{\tilde{A}}$. The idea is the following: let $\tilde{A}$ be the unique multilinear extension of A over $\mathbb{F}_q$.
Clearly a $\mathsf{PP}^{A}$ machine can decide whether $\sum_{x \in \{0,1\}^n} A(x) \geq 2^{n-1}$. On the other hand, supposing a $\mathsf{PH}^{\tilde{A}}$ machine
solved the same problem, we could interpret the universal quantifiers as AND gates, the existential quantifiers as
OR gates, and the queries to $\tilde{A}$ as summation gates modulo q. We could thereby obtain an $\mathsf{AC}^0[q]$ circuit of size
$2^{\mathrm{poly}(n)}$, which computed the Boolean MAJORITY given an input of size $2^n$ (namely, the truth table of A). But
when q = o(n/log n), such a constant-depth circuit violates the celebrated lower bound of Smolensky [38].

Unfortunately, the above argument breaks down when the field size is large compared to n, as it needs to be for
most algorithms that would actually exploit oracle access to $\tilde{A}$. Therefore, it could be argued that this result is not
“really” about algebrization.